Abstract

The performance of multiple hypothesis testing is known to be affected by
the statistical dependence among the random variables involved. The mechanisms
responsible for this, however, are not well understood. We study the effects
of the dependence structure of a finite state hidden Markov model (HMM) on
the likelihood ratios critical for optimal multiple testing on the hidden states.
Various convergence results are obtained for the likelihood ratios as the
observations of the HMM form an increasingly long chain. Analytic expansions of the
first and second order derivatives are obtained for the case of binary states,
explicitly showing the effects of the parameters of the HMM on the likelihood
ratios.

1 Introduction

Statistical dependence in data poses a challenge to multiple hypothesis testing.
Under the framework of the false discovery rate (FDR), many efforts have been made
to establish the control of FDR under dependence [5; 14; 25; 27; 29]. Meanwhile,
many empirical and analytical works have described the effects of dependence on
the outputs of multiple tests [12; 16; 22; 23]. However, in what way the dependence
impacts multiple testing is not well understood.

A useful model that incorporates tractable dependence in multiple testing is the
hidden Markov model (HMM) [27]. In the model, the nulls are organized as $H_t$,
where the index $t$ takes integer values. Each null $H_t$ is associated with a random
variable that determines whether the null is true or false. The random variables
form a Markov chain but are hidden and unobservable. Instead, each observation $X_t$
is a many-to-one transform of the hidden variable corresponding to $H_t$. In the
context of multiple testing, it will be useful to treat the hidden variable as consisting
of two parts, $\eta_t$ and $Z_t$. On the one hand, $\eta_t$ encodes the "true identity", or state of
the signal associated with $H_t$ and in general can take two or more possible values.
On the other, $Z_t$ acts as the noise that blurs or distorts the signal. Then $X_t$ can be
thought of as the result of a deterministic interaction between $\eta_t$ and $Z_t$.

To understand the role of dependence in the multiple tests on the nulls, the
"oracle" approach assumes the parameters in the HMM are known and explores
what amounts to an optimal testing procedure. The advantage of this approach
is that it can reveal effects purely due to dependence, without confounding with
effects due to specific parameter estimation methods. Suppose the observations are
$X_1, \ldots, X_n$. With the parameters being known, for each null $H_t$, the conditional
likelihood $\Pr\{H_t \text{ is true} \mid X_1, \ldots, X_n\}$ can be computed. The importance of the
conditional likelihood for multiple testing has been shown in various contexts [6;
21; 26; 27]. For the HMM, [27] shows that under a certain loss function, an optimal
procedure is to reject $H_t$ if and only if the corresponding conditional likelihood is
small enough. The loss function is a linear combination of the numbers of type I and
II errors and is related to the FDR. The importance of the conditional likelihood can
also be argued directly based on the FDR criterion, and in fact without any particular
assumption on dependence; see the Appendix.

In view of the role of the conditional likelihood, our aim is to investigate how it
is affected by the parameters of the HMM. The parameters can be divided into two
types, respectively characterizing the dependence among $\eta_t$ and the "strength" of
useful signals. In addition, the conditional likelihood also depends on how $\eta_t$ and $Z_t$
interact. The next example illustrates what role may be expected for these factors.

Example 1.1 Suppose the states $\eta_t$ are equal to $1\{H_t \text{ is false}\}$ and form a
stationary Markov chain with transition probabilities $q_{ij} = \Pr\{\eta_t = j \mid \eta_{t-1} = i\} > 0$;
moreover, conditional on $\eta = (\eta_t)$, $X_t$ are independent $\sim N(\varepsilon\eta_t, 1)$ with $\varepsilon > 0$. Write
$X_t = Z_t + \varepsilon\eta_t$. Then $(Z_t, \eta_t)$ form a hidden Markov chain, with $Z_t$ iid $\sim N(0, 1)$.
The strength of the signals is measured by $\varepsilon$, and the interaction between the noise $Z_t$
and $\eta_t$ is additive, such that $X_t = \varphi(Z_t, \varepsilon\eta_t)$ with $\varphi(z, \vartheta) = z + \vartheta$.

In many cases, the observations form a long chain $X_{-m}, \ldots, X_n$, with $m, n \gg 1$,
so the effect of the parameters can be studied through the properties of

$$\Pr\{\eta_t = 0 \mid X\} = \lim_{m,n\to\infty} \Pr\{\eta_t = 0 \mid X_{-m}, \ldots, X_n\}$$

for each $t$, where $X = (X_t, t \in \mathbb{Z})$. Since $\Pr\{\eta_t = 0 \mid X_{-m}, \ldots, X_n\}$ form a
martingale for any increasing sequences of $m$ and $n$, the (almost sure) existence of the limit
is guaranteed. However, this says nothing about how the limit depends on $\varepsilon$ and
$q_{ij}$. To get some insight, consider instead the likelihood ratio

$$\frac{\Pr\{\eta_t = 1 \mid X\}}{\Pr\{\eta_t = 0 \mid X\}} = \frac{1}{\Pr\{\eta_t = 0 \mid X\}} - 1,$$

which turns out to be a little more convenient to study.
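As $\varepsilon \to 0$, the signals become increasingly weak, making their identification more
and more challenging. To find out how the above ratio behaves in this weak-signal
scenario, without loss of generality, let $t = 0$. Note that since $\eta$ is stationary,
$\Pr\{\eta_{t-1} = j \mid \eta_t = i\} = q_{ij}$. Then, by the Bayes rule and the Markov property, for
$a = 0, 1$,

$$\Pr\{\eta_0 = a \mid X_{-m}, \ldots, X_n\} \propto P(a)\, E_\sigma\Big[\exp\Big\{-\frac{1}{2}\sum_{t=-m}^{n} (Z_t + \varepsilon\eta_t - \varepsilon\sigma_t)^2\Big\} \,\Big|\, \sigma_0 = a\Big],$$

where $P(a) = \Pr\{\eta_0 = a\}$ and $\sigma = (\sigma_t)$ is an independent copy of $\eta$ that is also
independent of $Z$. Take the logarithm of the likelihood ratio. Formally, one can get

$$\frac{\partial}{\partial\varepsilon} \ln\frac{\Pr\{\eta_0 = 1 \mid X\}}{\Pr\{\eta_0 = 0 \mid X\}}\bigg|_{\varepsilon=0}
= \lim_{m,n\to\infty} \frac{\partial}{\partial\varepsilon} \ln\frac{\Pr\{\eta_0 = 1 \mid X_{-m}, \ldots, X_n\}}{\Pr\{\eta_0 = 0 \mid X_{-m}, \ldots, X_n\}}\bigg|_{\varepsilon=0}$$

$$= \sum_{t=-\infty}^{\infty} Z_t\big[\Pr\{\eta_t = 1 \mid \eta_0 = 1\} - \Pr\{\eta_t = 1 \mid \eta_0 = 0\}\big]
= \sum_{t=-\infty}^{\infty} r^{|t|} Z_t,$$

where $r \ne 1$ is one of the two eigenvalues of the matrix $(q_{ij})$, the other being 1.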

The result can be read as follows. If the information on the dependence (i.e., $q_{ij}$)
is not available, but the values of all other parameters are known, including $P(a)$,
then the likelihood ratio would have to be evaluated as

$$\frac{\Pr\{\eta_0 = 1 \mid X_0\}}{\Pr\{\eta_0 = 0 \mid X_0\}} = \frac{P(1)}{P(0)}\,\frac{f(X_0 - \varepsilon)}{f(X_0)} = \frac{P(1)}{P(0)} \exp\Big[\varepsilon Z_0 + \frac{\varepsilon^2}{2}(2\eta_0 - 1)\Big],$$

where $f(x)$ is the density of $N(0, 1)$. That is, the ratio is the so-called "local FDR"
divided by $P(0)$ [13].

For the time being, let us call the conditional likelihood ratio based on the entire
$X$ the full likelihood ratio (FLR), and that only based on $X_0$ the local likelihood
ratio (LLR). It is then easy to see that for $\varepsilon \approx 0$,

$$\ln\frac{\mathrm{FLR}}{\mathrm{LLR}} \approx \varepsilon \sum_{t \ne 0} r^{|t|} Z_t.$$
Thus, at the first order, the dependence in $\eta$ merely adds more noise but no "net
effect", regardless of the actual values of $\eta$. If there is any state-dependent effect,
it should be reflected in a higher order term of $\varepsilon$. To see if this is the case, take
the second order derivative in $\varepsilon$. Again, the calculation can be done formally. To
evaluate the state-dependent net effect, proceed with

$$E\big[(\ln \mathrm{FLR})''\big|_{\varepsilon=0} \,\big|\, \eta_0\big]
= \lim_{m,n\to\infty} E\bigg[\frac{\partial^2}{\partial\varepsilon^2} \ln\frac{\Pr\{\eta_0 = 1 \mid X_{-m}, \ldots, X_n\}}{\Pr\{\eta_0 = 0 \mid X_{-m}, \ldots, X_n\}}\bigg|_{\varepsilon=0} \,\bigg|\, \eta_0\bigg]
= (2\eta_0 - 1) \sum_{t=-\infty}^{\infty} r^{2|t|},$$

giving

$$E\bigg[\Big(\ln\frac{\mathrm{FLR}}{\mathrm{LLR}}\Big)''\bigg|_{\varepsilon=0} \,\bigg|\, \eta_0\bigg] = (2\eta_0 - 1) \sum_{t \ne 0} r^{2|t|}.$$

It follows that, compared to $\ln \mathrm{LLR}$, if $\eta_0 = 1$, on average $\ln \mathrm{FLR}$ is larger, making
$H_0$ more likely to be (correctly) rejected, whereas if $\eta_0 = 0$, it is smaller, making
$H_0$ less likely to be (falsely) rejected.

From the expansions, the effect of the dependence in $\eta$ on the likelihood ratio is
apparent. In both the first and second order derivatives, the effect is determined by
$r$. In particular, when $r = 0$, $\eta_t$ are iid and FLR is equal to LLR. Consistent with
this, the derivatives of the difference between the two ratios become 0. □
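
The first order expansion in Example 1.1 can be checked numerically on a finite window. The sketch below is only an illustration under stated assumptions: the function name `log_flr`, the window length, and the parameter values are hypothetical, and the posterior is computed with a standard normalized forward-backward recursion rather than by any method specific to this paper. It compares a central finite difference of $\ln \mathrm{FLR}$ at $\varepsilon = 0$ with $\sum_t r^{|t|} Z_t$ taken over the window.

```python
import numpy as np

def log_flr(eps, Z, eta, Q, p, t0):
    """ln of Pr{eta_t0 = 1 | X} / Pr{eta_t0 = 0 | X} with X_t = Z_t + eps * eta_t.

    Z, eta : arrays over a finite window; Q : 2x2 transition matrix;
    p : stationary distribution of Q; t0 : index of the time point of interest.
    """
    X = Z + eps * eta
    # Emission likelihoods for states 0 and 1 (the normal constant cancels).
    lik = np.exp(-0.5 * (X[:, None] - eps * np.array([0.0, 1.0])) ** 2)
    T = len(X)
    # Forward pass, normalized at each step to avoid underflow.
    alpha = p * lik[0]
    alpha /= alpha.sum()
    alphas = [alpha]
    for t in range(1, T):
        alpha = (alpha @ Q) * lik[t]
        alpha /= alpha.sum()
        alphas.append(alpha)
    # Backward pass, also normalized.
    beta = np.ones(2)
    betas = [beta]
    for t in range(T - 1, 0, -1):
        beta = Q @ (lik[t] * beta)
        beta /= beta.sum()
        betas.append(beta)
    betas = betas[::-1]
    post = alphas[t0] * betas[t0]
    return np.log(post[1] / post[0])

# Hypothetical parameter values chosen only for illustration.
p01, p10 = 0.2, 0.3
Q = np.array([[1 - p01, p01], [p10, 1 - p10]])
p = np.array([p10, p01]) / (p01 + p10)
r = 1 - p01 - p10

rng = np.random.default_rng(0)
m = n = 20
T, t0 = m + n + 1, m
Z = rng.standard_normal(T)
eta = rng.integers(0, 2, size=T).astype(float)

h = 1e-4
numeric = (log_flr(h, Z, eta, Q, p, t0) - log_flr(-h, Z, eta, Q, p, t0)) / (2 * h)
predicted = np.sum(r ** np.abs(np.arange(T) - t0) * Z)
print(numeric, predicted)
```

On a finite window the sum runs only over the observed indices, so the comparison is between the finite-difference slope and the truncated series, not the infinite one.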



As in the example, the rest of the paper studies the derivatives of the likelihood
ratio $\Pr\{\eta_t = 1 \mid X\} / \Pr\{\eta_t = 0 \mid X\}$ or its
logarithm with respect to $\varepsilon$ and the relationships between the derivatives and the
parameters in the HMM. Since $X = (X_t)$ is generated with a fixed $\varepsilon$, the derivatives
should be interpreted as follows. During the differentiation, both the signal $\eta$ and
noise $Z$ are fixed. As the strength $\varepsilon$ of the signal varies, the observed values $X_t$
become functions of $\varepsilon$. The likelihood ratio is affected by $\varepsilon$ in two ways: not only
is the value of $X_t$ changed, but so is the parametric form of the density function of
$X_t$. Both have to be taken into account in the derivatives.

Several issues need to be addressed. First, we have only considered a stationary
process of the signals $\eta$. In applications, it is useful to consider nonstationary $\eta$
that have time-dependent transition probabilities. Moreover, it is useful to consider
various types of interactions between $\eta_t$ and $Z_t$ besides the additive one.

Second, in Example 1.1, each $\eta_t$ is binary, indicating whether a null is true
or false. For more generality, one can assume a finite state Markov chain, such
that a subset of the states are associated with true nulls and the rest with false
nulls. Even for a binary process, it can be useful to reformulate it as a multistate
Markov chain. For example, let $\eta$ be a second order binary Markov chain, i.e.
$\Pr\{\eta_t \mid \eta_s, s < t\} = \Pr\{\eta_t \mid \eta_{t-1}, \eta_{t-2}\}$. Then one can define a first order Markov
chain $\tilde\eta$ by $\tilde\eta_t = (\eta_{t-1}, \eta_t)$. If $\eta_t = 1\{H_t \text{ is false}\}$, then in terms of $\tilde\eta$, (0, 0) and (1, 0)
are states associated with true nulls, and (0, 1) and (1, 1) are states associated with
false nulls.
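
The reformulation of a second order binary chain as a first order chain on pairs can be sketched as follows. This is a minimal illustration, assuming a hypothetical input table `q2` of second order transition probabilities; the function name and the numeric values are not from the paper.

```python
import numpy as np

def lift_second_order(q2):
    """Build the first order transition matrix on pair states (eta_{t-1}, eta_t).

    q2[i][j] = Pr{eta_t = 1 | eta_{t-2} = i, eta_{t-1} = j} (hypothetical input).
    Pair states are ordered (0,0), (0,1), (1,0), (1,1).
    """
    states = [(0, 0), (0, 1), (1, 0), (1, 1)]
    P = np.zeros((4, 4))
    for row, (i, j) in enumerate(states):
        for col, (j2, k) in enumerate(states):
            if j2 != j:
                continue  # consecutive pairs must agree on the shared coordinate
            P[row, col] = q2[i][j] if k == 1 else 1.0 - q2[i][j]
    return states, P

q2 = [[0.1, 0.6], [0.3, 0.8]]          # illustrative values only
states, P = lift_second_order(q2)
# A pair state is a true null when its second coordinate (the current eta_t) is 0.
true_null = [s for s in states if s[1] == 0]
false_null = [s for s in states if s[1] == 1]
print(P)
```

Each row of the lifted matrix has at most two nonzero entries, which is why the lifted chain generally violates a one-step positivity condition and motivates allowing a gap of several steps in such conditions.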

Third, in Example 1.1, limit operation, differentiation, and expectation are freely
interchanged for $\Pr\{\eta_t \mid X_{-m}, \ldots, X_n\}$ for fixed $t$. This has to be justified. Note
that the likelihood bears similarity to $\Pr\{\eta_n \mid X_0, \ldots, X_n\}$, a quantity extensively
studied in the literature on nonlinear filtering and related issues [1; 2; 3; 7; 8; 9; 10;
11; 15; 17; 18; 19; 20; 28]. As in most of the cited works, in this paper, convergence
results are established using geometric contraction. On the other hand, in those
works, the goal is to establish weak convergence of the conditional probability for
$\eta_n$ under the assumption of stationary transition probabilities. As seen in Example
1.1, the convergence of the conditional probability for $\eta_t$ follows from martingale
convergence. So instead, the goal here is to establish convergence for the derivatives
of the conditional likelihood with arbitrary transition probabilities.

The rest of the paper proceeds as follows. In Section 2, an HMM is set up in the
context of multiple testing and then various convergence results on the likelihood 
ratio are stated. In Section 3, the likelihood ratio for a first order HMM with binary 
states is considered in more detail, which allows more explicit expressions for the 
first and second derivatives of the likelihood ratio. Several examples are given, with 
Example 1.1 being a special case. Theoretical details are provided in Section 4. 



2 Main results 
2.1 A HMM setup 

Let $\eta = \{\eta_t, t \in \mathbb{Z}\}$ be a finite state process, such that the state space $\mathcal{H}$ is
partitioned into $\mathcal{H}_0$ and $\mathcal{H}_1$, with states in $\mathcal{H}_0$ being associated with true nulls, while
those in $\mathcal{H}_1$ are associated with false nulls. The noise process is $Z = \{Z_t, t \in \mathbb{Z}\}$, with
each $Z_t$ taking values in a Euclidean space $\mathcal{Z}$. To model the interaction between $\eta_t$
and $Z_t$, let $\{\varphi(z, \vartheta), \vartheta \in \Theta\}$ be a family of mappings $\mathcal{Z} \to \mathcal{X}$ indexed by $\vartheta$, where
$\Theta$ is an open set in a Euclidean space and $\mathcal{X}$ is a Euclidean space. Then, let

$$\theta_a : \mathbb{R}^p \to \Theta, \quad a \in \mathcal{H}$$

be a family of functions, such that each $\varepsilon \in \mathbb{R}^p$ specifies a scenario where the
observations are

$$X_t = X_t(\varepsilon) = \varphi(Z_t, \theta_{\eta_t}(\varepsilon)). \qquad (2.1)$$

Intuitively, $\varphi(Z_t, \vartheta)$ determines how $Z_t$ interacts with a possible manifestation of
$\eta_t$ to generate an observation $X_t$; the manifestation of $\eta_t$ is $\theta_{\eta_t}(\varepsilon)$, with $\varepsilon$ being the
tuning parameter that determines how strongly $\eta_t$ manifests itself. The dimension
$p$ of $\varepsilon$ may be greater than 1 to take into account different aspects of the tuning.
We will assume that $(\eta, Z)$ is defined on the canonical space $\mathcal{H}^{\mathbb{Z}} \times \mathcal{Z}^{\mathbb{Z}}$ equipped with
the product Borel $\sigma$-algebra.

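For a function $h : \mathbb{R}^s \to \mathbb{R}$ and an $s$-tuple of nonnegative integers $\nu = (\nu_1, \ldots, \nu_s)$,
denote the $\nu$-th derivative of $h$ and its order respectively by

$$h^{(\nu)}(x) = \frac{\partial^{|\nu|} h(x)}{\partial x_1^{\nu_1} \cdots \partial x_s^{\nu_s}}, \qquad |\nu| = \nu_1 + \cdots + \nu_s.$$

Denote $h^{(\nu)} = h$ if $\nu = 0 := (0, \ldots, 0)$. For $q \in \mathbb{N}$, denote $h \in C^{(q)}$ if $h^{(\nu)}$ exists and
is continuous for every $|\nu| \le q$. If $\iota = (\iota_1, \ldots, \iota_s)$ and $\nu = (\nu_1, \ldots, \nu_s)$, denote $\iota \le \nu$
if $\iota_k \le \nu_k$ for every $k = 1, \ldots, s$ and denote $\iota < \nu$ if $\iota \le \nu$ and $\iota \ne \nu$.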
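Assumptions We have already assumed that $\eta_t$ takes values in a finite set $\mathcal{H}$. In
addition, different subsets of the following assumptions will be needed for different
occasions.

1. $Z$ is independent of $\eta$ and $Z_t$ are iid such that for each $\vartheta \in \Theta$ and $t \in \mathbb{Z}$,
$\varphi(Z_t, \vartheta)$ has a density $f(x, \vartheta)$.

2. $\eta$ is a Markov chain and there are $\kappa \ge 1$, $\phi_* > 0$, such that for all $a, b \in \mathcal{H}$
and $s, t \in \mathbb{Z}$ with $|s - t| \ge \kappa$, $\Pr\{\eta_t = b \mid \eta_s = a\} \ge \phi_*$.

3. For each $z \in \mathcal{Z}$ and $a, b \in \mathcal{H}$, $0 < f(\varphi(z, \theta_a(\varepsilon)), \theta_b(\varepsilon)) < \infty$ and is continuous
in $\varepsilon$.

4. There is $q \in \mathbb{N}$, such that for each $z \in \mathcal{Z}$ and $a, b \in \mathcal{H}$, $f(\varphi(z, \theta_a(\varepsilon)), \theta_b(\varepsilon))$ as
a function in $\varepsilon$ belongs to $C^{(q)}$ and all its partial derivatives of order $\le q$ are
continuous in $(z, \varepsilon)$. Furthermore, for $r > 0$, there is $c = c(r) > 2$, such that

$$\Pr\{M_q(Z_0, r) \ge n\} = O\big((\log n)^{-c}\big), \quad n \to \infty,$$

where, letting $\ell_{z,ab}(\varepsilon) = \ln f(\varphi(z, \theta_a(\varepsilon)), \theta_b(\varepsilon))$,

$$M_q(z, r) = \sup\big\{ |\ell_{z,ab}^{(\nu)}(\varepsilon)| : 1 \le |\nu| \le q,\ |\varepsilon| \le r,\ a, b \in \mathcal{H} \big\}.$$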

5. For any $r > 0$, $E[M_q(Z_0, r)^k] < \infty$, where $k = q^2(q + 1)/2$.

Henceforth, for $s, t \in \mathbb{Z}$ and $a, b \in \mathcal{H}$, denote

$$P_t(a) = \Pr\{\eta_t = a\}, \qquad P_{st}(a, b) = \Pr\{\eta_t = b \mid \eta_s = a\}.$$

Remark.

1) In Assumption 2, $\eta$ need not be stationary or have time-homogeneous
transitions.

2) Assumption 3 implies that no value of $X_t$ can decisively exclude some elements
in $\mathcal{H}$ while including others as possible values for $\eta_t$.

3) In Example 1.1, since $\ell_{z,ab}(\varepsilon) = -\frac{1}{2}[z + \varepsilon(a - b)]^2 - \ln\sqrt{2\pi}$ and $Z_t \sim N(0, 1)$,
the HMM satisfies Assumption 5. The assumption is stronger than Assumption 4.
To get results on almost sure convergence, Assumption 4 suffices. However, to get results
on expectations, Assumption 5 will be used.

4) Assumption 2 can be relaxed as follows: there are $\phi_* > 0$ and $\ldots < s_k <
t_k < s_{k+1} < \ldots$, with $s_k \to \pm\infty$ as $k \to \pm\infty$, such that $P_{s_k t_k}(a, b) \ge \phi_*$ and for
$n \gg 1$, $\#\{k : -n \le s_k \le 0\}/n$ and $\#\{k : 0 \le s_k \le n\}/n$ are bounded away from 0.
The analysis under the relaxed assumption follows the same line as the rest of the
paper but is more technical. We will not pursue it here.
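2.2 Asymptotics

Given $\varepsilon$ and $m, n \in \mathbb{N}$, if the observations consist of $X_s(\varepsilon) = \varphi(Z_s, \theta_{\eta_s}(\varepsilon))$ with
$s = -m, \ldots, n$, the likelihood ratio for false null vs true null at $t$ is

$$\rho_{t,mn}(\varepsilon) = \frac{\Pr\{\eta_t \in \mathcal{H}_1 \mid X_{-m}(\varepsilon), \ldots, X_n(\varepsilon)\}}{\Pr\{\eta_t \in \mathcal{H}_0 \mid X_{-m}(\varepsilon), \ldots, X_n(\varepsilon)\}}.$$

By Bayes formula,

$$\rho_{t,mn}(\varepsilon) = \frac{\sum_{a \in \mathcal{H}_1} P_t(a)\, E_\sigma\big[\prod_{s=-m}^{n} \psi_s(\varepsilon, \sigma_s) \,\big|\, \sigma_t = a\big]}{\sum_{a \in \mathcal{H}_0} P_t(a)\, E_\sigma\big[\prod_{s=-m}^{n} \psi_s(\varepsilon, \sigma_s) \,\big|\, \sigma_t = a\big]}, \qquad (2.2)$$

where $\sigma = (\sigma_t)$ is an independent copy of $\eta$ and is independent of $Z$ as well, $E_\sigma$
denotes the expectation with respect to $\sigma$, and for $c \in \mathcal{H}$,

$$\psi_t(\varepsilon, c) = f(X_t(\varepsilon), \theta_c(\varepsilon)) = f(\varphi(Z_t, \theta_{\eta_t}(\varepsilon)), \theta_c(\varepsilon)). \qquad (2.3)$$

As discussed in the Introduction,

$$\rho_t(\varepsilon) = \lim_{m,n\to\infty} \rho_{t,mn}(\varepsilon) = \frac{\Pr\{\eta_t \in \mathcal{H}_1 \mid X_s(\varepsilon), s \in \mathbb{Z}\}}{\Pr\{\eta_t \in \mathcal{H}_0 \mid X_s(\varepsilon), s \in \mathbb{Z}\}}$$

exists almost surely due to martingale convergence and plays an important role in
optimal multiple testing procedures.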

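Theorem 2.1 Suppose Assumptions 1-4 hold.

1. Almost surely, $\rho_{t,mn} \in C^{(q)}$ for $t = -m + \kappa, \ldots, n - \kappa$.

2. Almost surely, $\rho_t(\varepsilon)$ is strictly positive for all $t$ and $\varepsilon$.

3. There is a deterministic function $r_{t,\nu}(\varepsilon_0) \in (0, 1)$ in $\varepsilon_0 > 0$ for each $t \in \mathbb{Z}$
and $\nu$ with $|\nu| \le q$, such that almost surely, as $m, n \to \infty$, $\rho_{t,mn}^{(\nu)}(\varepsilon)$ converges, with

$$\sup_{|\varepsilon| \le \varepsilon_0} \Big| \rho_{t,mn}^{(\nu)}(\varepsilon) - \lim_{m,n\to\infty} \rho_{t,mn}^{(\nu)}(\varepsilon) \Big| = o\big(r_{t,\nu}(\varepsilon_0)^{m \wedge n}\big),$$

for all $t \in \mathbb{Z}$, $\nu$ with $|\nu| \le q$ and $\varepsilon_0 > 0$.

Due to the uniform convergence of $\rho_{t,mn}^{(\nu)}$ on every compact set,

$$\rho_t \in C^{(q)}, \qquad \rho_t^{(\nu)}(\varepsilon) = \lim_{m,n\to\infty} \rho_{t,mn}^{(\nu)}(\varepsilon), \quad t \in \mathbb{Z},\ |\nu| \le q \qquad (2.4)$$

(cf. [24], Theorem 7.17). Since $\rho_t(\varepsilon)$ are strictly positive, the interchange between
limit operation and differentiation for their logarithms in Example 1.1 is justified.

Since $\mathbb{Z}$ is countable, in order to establish Theorem 2.1, it suffices to show it
holds for each fixed $t \in \mathbb{Z}$. Without loss of generality, we shall focus on $t = 0$. For
ease of notation, henceforth denote $\rho_{mn} = \rho_{0,mn}$.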
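By the conditional independence of $(\sigma_t, t < 0)$ and $(\sigma_t, t > 0)$ given $\sigma_0$,

$$E_\sigma\Big[\prod_{s=-m}^{n} \psi_s(\varepsilon, \sigma_s) \,\Big|\, \sigma_0\Big] = \psi_0(\varepsilon, \sigma_0)\, E_\sigma\Big[\prod_{s=1}^{n} \psi_s(\varepsilon, \sigma_s) \,\Big|\, \sigma_0\Big]\, E_\sigma\Big[\prod_{s=1}^{m} \psi_{-s}(\varepsilon, \sigma_{-s}) \,\Big|\, \sigma_0\Big].$$

Fix an arbitrary $i \in \mathcal{H}$. Define

$$\Lambda_{\pm n,a}(\varepsilon) = \frac{E_\sigma\big[\prod_{s=1}^{n} \psi_{\pm s}(\varepsilon, \sigma_{\pm s}) \,\big|\, \sigma_0 = a\big]}{E_\sigma\big[\prod_{s=1}^{n} \psi_{\pm s}(\varepsilon, \sigma_{\pm s}) \,\big|\, \sigma_0 = i\big]}. \qquad (2.5)$$

Then (2.2) for $t = 0$ can be written as

$$\rho_{mn}(\varepsilon) = \frac{\sum_{a \in \mathcal{H}_1} P_0(a)\, \psi_0(\varepsilon, a)\, \Lambda_{n,a}(\varepsilon)\, \Lambda_{-m,a}(\varepsilon)}{\sum_{a \in \mathcal{H}_0} P_0(a)\, \psi_0(\varepsilon, a)\, \Lambda_{n,a}(\varepsilon)\, \Lambda_{-m,a}(\varepsilon)}. \qquad (2.6)$$

From (2.6), it is not hard to see that Theorem 2.1 follows from the next two
results.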

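Theorem 2.2 Let Assumptions 1-3 hold. Almost surely, as $n \to \infty$, for all
$a \in \mathcal{H}$, $\Lambda_{n,a}(\varepsilon)$ and $\Lambda_{-n,a}(\varepsilon)$ converge uniformly on every compact set of $\varepsilon$. The
limits

$$L_a(\varepsilon) = \lim_{n\to\infty} \Lambda_{n,a}(\varepsilon), \qquad \bar L_a(\varepsilon) = \lim_{n\to\infty} \Lambda_{-n,a}(\varepsilon) \qquad (2.7)$$

are strictly positive and continuous, and there is a deterministic increasing function
$r(\varepsilon_0) \in (0, 1)$ in $\varepsilon_0 > 0$, such that almost surely, as $n \to \infty$,

$$\sup_{|\varepsilon| \le \varepsilon_0} |\Lambda_{n,a}(\varepsilon) - L_a(\varepsilon)| = o\big(r(\varepsilon_0)^n\big), \quad \forall \varepsilon_0 > 0,$$

and likewise for $\Lambda_{-n,a}$ and $\bar L_a(\varepsilon)$.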
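
The geometric convergence asserted in Theorem 2.2 can be observed numerically in the additive Gaussian setting of Example 1.1. The sketch below is a hedged illustration: the function name `lambda_path`, the choice of reference state 0, and all parameter values are assumptions made here, and the one-step matrix recursion used is the standard forward recursion for a hidden Markov chain.

```python
import numpy as np

def lambda_path(eps, Z, eta, Q, ref=0):
    """Ratios of conditional expectations of products of emission likelihoods.

    Entry [n-1, a] is E[prod_{s<=n} psi_s | sigma_0 = a] divided by the same
    quantity for sigma_0 = ref, with X_s = Z_s + eps * eta_s and N(eps * c, 1)
    emissions. Z[n-1], eta[n-1] hold the values at time n >= 1.
    """
    L = np.eye(2)
    out = []
    for z, e in zip(Z, eta):
        x = z + eps * e
        psi = np.exp(-0.5 * (x - eps * np.array([0.0, 1.0])) ** 2)
        L = (L @ Q) * psi      # one step: transition, then weight column b by psi(b)
        L /= L.sum()           # rescale; only ratios of row sums are used
        rows = L.sum(axis=1)
        out.append(rows / rows[ref])
    return np.array(out)

# Illustrative parameter values (not from the paper).
Q = np.array([[0.8, 0.2], [0.3, 0.7]])
rng = np.random.default_rng(1)
N = 120
Z = rng.standard_normal(N)
eta = rng.integers(0, 2, size=N).astype(float)

lam = lambda_path(1.0, Z, eta, Q)
gap = np.abs(lam[:-1, 1] - lam[-1, 1])   # distance to the value at the last step
print(gap[[4, 9, 19, 39, 59]])
```

Printing the gaps at a few values of n gives a quick visual check that they shrink roughly geometrically until they reach floating-point precision.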

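Theorem 2.3 Let Assumptions 1-4 hold. Then almost surely, as $n \to \infty$, for
each nonzero $\nu$ with $|\nu| \le q$ and $a \in \mathcal{H}$, $\Lambda_{n,a}^{(\nu)}(\varepsilon)$ and $\Lambda_{-n,a}^{(\nu)}(\varepsilon)$ converge uniformly on every
compact set of $\varepsilon$. Let

$$L_{\nu,a}(\varepsilon) = \lim_{n\to\infty} \Lambda_{n,a}^{(\nu)}(\varepsilon), \qquad \bar L_{\nu,a}(\varepsilon) = \lim_{n\to\infty} \Lambda_{-n,a}^{(\nu)}(\varepsilon).$$

There is an increasing deterministic function $r_\nu(\varepsilon_0) \in (0, 1)$ in $\varepsilon_0 > 0$, such that
almost surely, as $n \to \infty$,

$$\max_{a} \sup_{|\varepsilon| \le \varepsilon_0} \big|\Lambda_{n,a}^{(\nu)}(\varepsilon) - L_{\nu,a}(\varepsilon)\big| = o\big(r_\nu(\varepsilon_0)^n\big), \quad \forall \varepsilon_0 > 0,$$

and likewise for $\Lambda_{-n,a}^{(\nu)}$ and $\bar L_{\nu,a}(\varepsilon)$.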
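Basically, the two theorems mean that $L_a(\varepsilon)$ and $\bar L_a(\varepsilon)$ are $q$ times differentiable,
and for $\nu$ with $|\nu| \le q$,

$$L_a^{(\nu)}(\varepsilon) = L_{\nu,a}(\varepsilon), \qquad \bar L_a^{(\nu)}(\varepsilon) = \bar L_{\nu,a}(\varepsilon),$$

that is, $(\lim \Lambda_{\pm n,a})^{(\nu)} = \lim \Lambda_{\pm n,a}^{(\nu)}$. As a result, $\rho(\varepsilon) = \rho_0(\varepsilon)$ is $q$ times differentiable, with

$$\rho(\varepsilon) = \frac{\sum_{a \in \mathcal{H}_1} \psi_0(\varepsilon, a)\, P_0(a)\, L_a(\varepsilon)\, \bar L_a(\varepsilon)}{\sum_{a \in \mathcal{H}_0} \psi_0(\varepsilon, a)\, P_0(a)\, L_a(\varepsilon)\, \bar L_a(\varepsilon)}. \qquad (2.8)$$

In Example 1.1, we derived $E[(\ln \rho)^{(\nu)}(0) \mid \eta_0]$ by freely interchanging limit
operation, differentiation, and expectation. The next result implies this is correct.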

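Theorem 2.4 Let Assumptions 1-3 and 5 hold and $\kappa = 1$ in Assumption 2.

1. There are $0 < c < C < \infty$, such that almost surely, $c \le \Lambda_{n,a}(\varepsilon) \le C$ for all
$n \ge 1$, $a \in \mathcal{H}$ and $\varepsilon$, thus

$$E[\ln L_a(\varepsilon)] = \lim_{n\to\infty} E[\ln \Lambda_{n,a}(\varepsilon)].$$

2. For $\nu$ with $1 \le |\nu| \le q$ and $a \in \mathcal{H}$,

$$\big(E[\ln L_a(\varepsilon)]\big)^{(\nu)} = E\big[(\ln L_a)^{(\nu)}(\varepsilon)\big] = \lim_{n\to\infty} E\big[(\ln \Lambda_{n,a})^{(\nu)}(\varepsilon)\big].$$

Similar results hold for $\Lambda_{-n,a}$ and $\bar L_a$.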

3 Binary state HMM with univariate parameters 

In this section, we consider in more detail the case where $\eta$ is a first order binary
state Markov chain, with $\eta_t = 1\{H_t \text{ is false}\}$. Also, we suppose $\varepsilon \in \mathbb{R}$ and

$$\theta_0(0) = \theta_1(0) = 0, \qquad (3.1)$$

i.e., at $\varepsilon = 0$, false and true nulls are no longer distinguishable based on the data.

To find out how the likelihood ratio behaves when the signals are weak, we shall
derive the explicit form of its derivatives at $\varepsilon = 0$. We shall focus on the likelihood
ratio at time $t = 0$. Analysis for other $t$ can be done likewise.

3.1 Derivatives of likelihood ratio 

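Recall that if we only evaluate the likelihood ratio based on $X_0$, then the value is

$$\tilde\rho(\varepsilon) = \frac{\Pr\{\eta_0 = 1 \mid X_0\}}{\Pr\{\eta_0 = 0 \mid X_0\}} = \frac{P_0(1)\, \psi_0(\varepsilon, 1)}{P_0(0)\, \psi_0(\varepsilon, 0)},$$

where for $t \in \mathbb{Z}$, $\psi_t(\varepsilon, a) = f(X_t, \theta_a(\varepsilon)) = f(\varphi(Z_t, \theta_{\eta_t}(\varepsilon)), \theta_a(\varepsilon))$. Comparing to
(2.8), the likelihood ratio $\rho(\varepsilon)$ based on the entire observations satisfies

$$\ln\frac{\rho(\varepsilon)}{\tilde\rho(\varepsilon)} = r(\varepsilon) + \bar r(\varepsilon), \quad \text{with } r(\varepsilon) = \ln\frac{L_1(\varepsilon)}{L_0(\varepsilon)}, \quad \bar r(\varepsilon) = \ln\frac{\bar L_1(\varepsilon)}{\bar L_0(\varepsilon)}.$$

Therefore, the effect of dependence is characterized by $r(\varepsilon)$ and $\bar r(\varepsilon)$.
We shall focus on $r(\varepsilon)$. The treatment of $\bar r(\varepsilon)$ is similar. Recall

$$r(\varepsilon) = \lim_{n\to\infty} \Lambda_n(\varepsilon), \quad \text{with } \Lambda_n(\varepsilon) = \ln\frac{E_\sigma\big[\prod_{t=1}^{n} \psi_t(\varepsilon, \sigma_t) \,\big|\, \sigma_0 = 1\big]}{E_\sigma\big[\prod_{t=1}^{n} \psi_t(\varepsilon, \sigma_t) \,\big|\, \sigma_0 = 0\big]},$$

and $\Lambda_{-n}(\varepsilon)$ defined likewise with $\psi_{-t}$ and $\sigma_{-t}$. By (3.1), for $t \in \mathbb{Z}$,

$$\psi_t(0, \sigma_t) = f(\varphi(Z_t, \theta_{\eta_t}(0)), \theta_{\sigma_t}(0)) = f(\varphi(Z_t, 0), 0) \qquad (3.2)$$

independent of $\sigma$, so $\Lambda_{\pm n}(0) = 0$, giving $r(0) = 0$. Next, define

$$d_t(\varepsilon) = \ln \psi_t(\varepsilon, 1) - \ln \psi_t(\varepsilon, 0), \qquad D_{st} = P_{st}(1, 1) - P_{st}(0, 1), \quad s, t \in \mathbb{Z}.$$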
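Theorem 3.1 Let Assumptions 1-4 hold. Then

$$r'(0) = \sum_{t=1}^{\infty} D_{0t}\, d_t'(0), \qquad (3.3)$$

$$r''(0) = \sum_{t=1}^{\infty} D_{0t} \big\{ d_t''(0) + [P_{0t}(1, 0) - P_{0t}(0, 1)][d_t'(0)]^2 \big\} + 2 \sum_{t=1}^{\infty} d_t'(0) \sum_{s=1}^{t-1} u_{st}, \qquad (3.4)$$

where $'$, $''$, $\ldots$ denote differentiations with respect to $\varepsilon$ and, for $1 \le s < t$, with $\ell_s(\varepsilon, c) = \ln \psi_s(\varepsilon, c)$,

$$u_{st} = D_{0s}[D_{st} P_{0s}(0, 0) - D_{0t}]\, d_s'(0) + D_{0s} D_{st}\, \ell_s'(0, 0) - D_{0t}\, E_\sigma[\ell_s'(0, \sigma_s) \mid \sigma_0 = 0].$$

The expressions of $r'(0)$ and $r''(0)$ are much simpler when $\eta$ is time homogeneous
and stationary, with $p_a = P_0(a) \in (0, 1)$ and transition matrix

$$Q = \begin{pmatrix} 1 - p_{01} & p_{01} \\ p_{10} & 1 - p_{10} \end{pmatrix}.$$

In this case,

$$p_0 = \frac{p_{10}}{p_{01} + p_{10}}, \quad p_1 = \frac{p_{01}}{p_{01} + p_{10}}, \quad Q = \begin{pmatrix} 1 \\ 1 \end{pmatrix} (p_0, p_1) + r \begin{pmatrix} p_1 \\ -p_0 \end{pmatrix} (1, -1), \qquad (3.5)$$

with $r = 1 - p_{01} - p_{10} \in (-1, 1)$. Then for any $t \ge 1$,

$$Q^t = \begin{pmatrix} 1 \\ 1 \end{pmatrix} (p_0, p_1) + r^t \begin{pmatrix} p_1 \\ -p_0 \end{pmatrix} (1, -1).$$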
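
The spectral decomposition of a two-state transition matrix and the resulting closed form for its powers can be verified numerically. The following is a minimal sketch with illustrative values of the off-diagonal transition probabilities (here named `p01` and `p10`); the helper `Q_power` is a hypothetical name introduced only for this check. It also confirms that the difference between the two rows' probabilities of reaching state 1 after t steps equals the t-th power of the non-unit eigenvalue.

```python
import numpy as np

p01, p10 = 0.2, 0.3                      # illustrative values only
Q = np.array([[1 - p01, p01], [p10, 1 - p10]])
p0, p1 = p10 / (p01 + p10), p01 / (p01 + p10)   # stationary distribution
r = 1 - p01 - p10                         # the eigenvalue of Q other than 1

def Q_power(t):
    """Closed form: rank-one stationary part plus r**t times a rank-one correction."""
    return np.outer([1.0, 1.0], [p0, p1]) + r ** t * np.outer([p1, -p0], [1.0, -1.0])

for t in range(1, 7):
    Qt = np.linalg.matrix_power(Q, t)
    D0t = Qt[1, 1] - Qt[0, 1]             # Pr{state 1 at t | start 1} - Pr{state 1 at t | start 0}
    print(t, np.allclose(Qt, Q_power(t)), D0t, r ** t)
```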
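As a result, $D_{0t} = r^t$ and

$$r'(0) = \sum_{t=1}^{\infty} r^t d_t'(0), \qquad (3.6)$$

$$r''(0) = \sum_{t=1}^{\infty} r^t \big\{ d_t''(0) + (p_0 - p_1)(1 - r^t)[d_t'(0)]^2 \big\}
+ 2(p_0 - p_1) \sum_{t=1}^{\infty} r^t d_t'(0) \sum_{s=1}^{t-1} (1 - r^s)\, d_s'(0). \qquad (3.7)$$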
3.2 A univariate case 

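In this section, we suppose both $X_t$ and $\theta_{\eta_t}(\varepsilon)$ are univariate, and the following
regularity conditions are satisfied:

1. $\lambda(\varphi(z, v), \vartheta)$, where $\lambda(x, \vartheta) = \ln f(x, \vartheta)$, as a function in $(v, \vartheta)$ belongs to $C^{(2)}$, such that for any
$\vartheta$, $v$, and $\nu$ with $|\nu| \le 2$, $E[\lambda^{(\nu)}(\varphi(Z_t, v), \vartheta)] = (E[\lambda(\varphi(Z_t, v), \vartheta)])^{(\nu)}$, where the
differentiation is with respect to $v$ and $\vartheta$.

2. $\theta_a(\varepsilon) \in C^{(2)}$ for any $a \in \mathcal{H}$.

Proposition 3.2 Let Assumptions 1-4 hold. Then for each $t$,

$$d_t'(0) = [\theta_1'(0) - \theta_0'(0)]\, \frac{\partial \lambda(X_t, 0)}{\partial \vartheta}, \qquad (3.8)$$

$$d_t''(0) = 2[\theta_1'(0) - \theta_0'(0)]\, \theta_{\eta_t}'(0)\, \frac{\partial \varphi(Z_t, 0)}{\partial v}\, \frac{\partial^2 \lambda(X_t, 0)}{\partial x\, \partial \vartheta}
+ [\theta_1'(0)^2 - \theta_0'(0)^2]\, \frac{\partial^2 \lambda(X_t, 0)}{\partial \vartheta^2} + [\theta_1''(0) - \theta_0''(0)]\, \frac{\partial \lambda(X_t, 0)}{\partial \vartheta}, \qquad (3.9)$$

where $X_t = \varphi(Z_t, 0)$ has density $f(x, 0)$.

Proposition 3.3 Let Assumptions 1-3 and 5 hold. Then

$$E[r'(0) \mid \eta] = 0, \qquad (3.10)$$

$$E[r''(0) \mid \eta] = \mathrm{Var}[d_0'(0)] \sum_{t=1}^{\infty} D_{0t}[2\eta_t - P_{0t}(1, 1) - P_{0t}(0, 1)], \qquad (3.11)$$

and in particular

$$E[r''(0) \mid \eta_0] = (2\eta_0 - 1)\, \mathrm{Var}[d_0'(0)] \sum_{t=1}^{\infty} D_{0t}^2. \qquad (3.12)$$

Moreover, $\mathrm{Var}[d_0'(0)] = [\theta_1'(0) - \theta_0'(0)]^2 J(0)$, where $J(\vartheta)$ is the Fisher information
for the parametric family $f(x, \vartheta)$.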
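3.3 Examples

Example 3.1 (Translation) Suppose $\varphi$ is defined on $\mathbb{R} \times \mathbb{R}$ such that $\varphi(z, v) =
z + v$ and for $a = 0, 1$, $\theta_a(\varepsilon) = \varepsilon a$. Let each $Z_t$ have density $h(z) = e^{-V(z)}$.
Apparently, Example 1.1 belongs to this case.

For each $\vartheta \in \mathbb{R}$, $\varphi(Z_t, \vartheta) = Z_t + \vartheta$ has density $f(x, \vartheta) = h(x - \vartheta)$. Therefore
$\lambda(x, \vartheta) = \ln f(x, \vartheta) = -V(x - \vartheta)$. It is easy to check

$$\frac{\partial \lambda(x, \vartheta)}{\partial \vartheta} = V'(x - \vartheta), \qquad \frac{\partial^2 \lambda(x, \vartheta)}{\partial x\, \partial \vartheta} = V''(x - \vartheta), \qquad \frac{\partial^2 \lambda(x, \vartheta)}{\partial \vartheta^2} = -V''(x - \vartheta).$$

Provided necessary conditions are satisfied, by (3.8) and (3.9),

$$d_t'(0) = V'(Z_t), \qquad d_t''(0) = (2\eta_t - 1)\, V''(Z_t).$$

Then $r'(0)$ and $r''(0)$ can be calculated by Proposition 3.2. Since $\mathrm{Var}[d_0'(0)] =
\int V'(x)^2 e^{-V(x)}\, dx$, $E[r''(0) \mid \eta]$ can be calculated by Proposition 3.3. □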



Example 3.2 (Scaling) Suppose $\varphi$ is defined on $\mathbb{R} \times \mathbb{R}$ such that $\varphi(z, v) = e^{-v} z$
and for $a = 0, 1$, $\theta_a(\varepsilon) = \varepsilon a$. Let each $Z_t$ have density $h(z) = e^{-V(z)}$. For $v \in \mathbb{R}$,
$\varphi(Z_t, v)$ has density $f(x, v) = e^{v} h(e^{v} x)$. Therefore, $\lambda(x, v) = v - V(e^{v} x)$. By (3.8)
and (3.9),

$$d_t'(0) = 1 - Z_t V'(Z_t), \qquad d_t''(0) = (2\eta_t - 1)\, Z_t [V'(Z_t) + Z_t V''(Z_t)].$$

Then $r'(0)$ and $r''(0)$ can be calculated by Proposition 3.2. Since $\mathrm{Var}[d_0'(0)] =
\int [1 - x V'(x)]^2 e^{-V(x)}\, dx$, $E[r''(0) \mid \eta]$ can be calculated by Proposition 3.3. □
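
The derivative formulas in the scaling example can be sanity-checked by finite differences. The sketch below is an illustration under assumptions made here: the noise is taken to be standard normal, so that V(z) = z^2/2 + ln sqrt(2 pi), V'(z) = z and V''(z) = 1, and the function names and the test point are hypothetical. It differentiates the log likelihood difference between states 1 and 0 numerically at zero signal strength and compares with the closed forms 1 - Z V'(Z) and (2 eta - 1) Z [V'(Z) + Z V''(Z)].

```python
import numpy as np

def V(z):
    return 0.5 * z ** 2 + 0.5 * np.log(2 * np.pi)   # standard normal noise (assumption)

def V1(z):
    return z                                        # V'

def V2(z):
    return 1.0                                      # V''

def lam(x, v):
    """Log density of exp(-v) * Z at x, i.e. v - V(exp(v) * x)."""
    return v - V(np.exp(v) * x)

def d(eps, z, eta):
    """Log likelihood difference between states 1 and 0 at signal strength eps."""
    x = np.exp(-eps * eta) * z      # observation generated with true state eta
    return lam(x, eps) - lam(x, 0.0)

z = 1.3                             # arbitrary test point
results = {}
for eta in (0, 1):
    h1, h2 = 1e-5, 1e-3
    first = (d(h1, z, eta) - d(-h1, z, eta)) / (2 * h1)
    second = (d(h2, z, eta) - 2 * d(0.0, z, eta) + d(-h2, z, eta)) / h2 ** 2
    results[eta] = (first, 1 - z * V1(z), second, (2 * eta - 1) * z * (V1(z) + z * V2(z)))
    print(eta, results[eta])
```

The first derivative does not depend on the true state while the second changes sign with it, which mirrors the state-dependent second order effect discussed for the additive case.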



Example 3.3 (t-statistics) Suppose the data observed at each time point $t$ is a
random vector $\xi_t = (\xi_{t,1}, \ldots, \xi_{t,\nu+1})$, such that conditional on $\eta$, $\xi_t$ are independent
of each other, and for each $t$, $\xi_{t,j}$ are iid $\sim N(\varepsilon\eta_t, s_t)$ for some $s_t = s_t(\eta) > 0$.
Suppose $s_t$ are completely intractable, i.e., there is no information on the values of
$s_t$ or their interrelations. In this case, it is reasonable to use the t-statistics

$$X_t = \sqrt{\nu(\nu + 1)}\; \bar\xi_t / S_t$$

for the tests on $\eta_t$, where $\bar\xi_t$ is the mean of $\xi_{t,j}$ and $S_t^2$ is the sum of squares of
$\xi_{t,j} - \bar\xi_t$.

Let $\zeta_t = \sqrt{\nu + 1}\,(\bar\xi_t - \varepsilon\eta_t)$. Given $\eta$, $\zeta_t \sim N(0, 1)$ and $S_t^2 \sim \chi^2_\nu$ are independent
of each other. Define $Z_t = (\zeta_t, S_t)$ and, for $z = (r, s)$ and $a = 0, 1$, define

$$\varphi(z, v) = \sqrt{\nu}\,(r + v)/s, \qquad \theta_a(\varepsilon) = \sqrt{\nu + 1}\; a\varepsilon.$$

Then $X_t = \sqrt{\nu}\,(\zeta_t + \sqrt{\nu + 1}\,\varepsilon\eta_t)/S_t = \varphi(Z_t, \theta_{\eta_t}(\varepsilon))$. Conditional on $\eta$, $X_t \sim t_{\nu,\vartheta}(x)$
with $\vartheta = \theta_{\eta_t}(\varepsilon)$, i.e., the noncentral t-distribution with $\nu$ degrees of freedom (df)
and noncentrality parameter $\vartheta$. In terms of Assumption 1, $f(x, \vartheta) = t_{\nu,\vartheta}(x)$.
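Recall

$$t_\nu(x) = \frac{c_\nu}{(\nu + x^2)^{(\nu+1)/2}} \quad \text{with } c_\nu = \frac{\nu^{\nu/2}\, \Gamma(\frac{\nu+1}{2})}{\sqrt{\pi}\, \Gamma(\frac{\nu}{2})},$$

$$t_{\nu,\vartheta}(x) = t_\nu(x)\, e^{-\vartheta^2/2} \sum_{k=0}^{\infty} \frac{c_k}{k!} \Big(\frac{\vartheta x}{\sqrt{\nu + x^2}}\Big)^k \quad \text{with } c_k = \frac{\Gamma(\frac{\nu+k+1}{2})\, 2^{k/2}}{\Gamma(\frac{\nu+1}{2})}.$$

Therefore,

$$\lambda(x, \vartheta) = \ln f(x, \vartheta) = \ln t_\nu(x) - \frac{1}{2}\vartheta^2 + \ln\Big[1 + \sum_{k=1}^{\infty} \frac{c_k}{k!} \Big(\frac{\vartheta x}{\sqrt{\nu + x^2}}\Big)^k\Big].$$

By $\ln(1 + x) = x - \frac{1}{2}x^2 + \frac{1}{3}x^3 - \cdots$,

$$\lambda(x, \vartheta) = \frac{c_1 x}{\sqrt{\nu + x^2}}\, \vartheta + \frac{1}{2}\Big[\frac{(c_2 - c_1^2)\, x^2}{\nu + x^2} - 1\Big] \vartheta^2 + \ln t_\nu(x) + O(\vartheta^3).$$

It follows that

$$\frac{\partial \lambda(x, 0)}{\partial \vartheta} = \frac{c_1 x}{\sqrt{\nu + x^2}}, \qquad \frac{\partial^2 \lambda(x, 0)}{\partial x\, \partial \vartheta} = \frac{c_1 \nu}{(\nu + x^2)^{3/2}}, \qquad \frac{\partial^2 \lambda(x, 0)}{\partial \vartheta^2} = \frac{(c_2 - c_1^2)\, x^2}{\nu + x^2} - 1.$$

At $\varepsilon = 0$, $X_t = \sqrt{\nu}\, \zeta_t / S_t$. Since $\theta_a'(0) = \sqrt{\nu + 1}\; a$, (3.8) yields

$$d_t'(0) = \frac{\sqrt{\nu + 1}\; c_1 X_t}{\sqrt{\nu + X_t^2}} = \frac{\sqrt{2(\nu + 1)}\; \Gamma(\frac{\nu}{2} + 1)}{\Gamma(\frac{\nu+1}{2})} \cdot \frac{\zeta_t}{\sqrt{\zeta_t^2 + S_t^2}}.$$

Next, since

$$\frac{\partial \varphi(Z_t, 0)}{\partial v} = \frac{\sqrt{\nu}}{S_t},$$

by (3.9),

$$d_t''(0) = \frac{2 c_1 (\nu + 1)\, \eta_t\, S_t^2}{(S_t^2 + \zeta_t^2)^{3/2}} + (\nu + 1)\Big[\frac{(c_2 - c_1^2)\, \zeta_t^2}{S_t^2 + \zeta_t^2} - 1\Big].$$

Then $r'(0)$ and $r''(0)$ can be calculated by Proposition 3.2.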

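To apply Proposition 3.3, we need to check if Assumption 5 holds. It is not
hard to see that for $g(\varepsilon) := \lambda(\varphi(Z_t, \theta_a(\varepsilon)), \theta_b(\varepsilon))$, $g^{(j)}(\varepsilon)$ is a linear combination of

$$S_t^{-i}\, \frac{\partial^{i+k} \lambda(x, \vartheta)}{\partial x^i\, \partial \vartheta^k}, \quad i + k = j,$$

evaluated at $x = \varphi(Z_t, \theta_a(\varepsilon))$ and $\vartheta = \theta_b(\varepsilon)$. It is also not hard
to see that the partial derivatives of $\lambda(x, \vartheta)$ involved are bounded, so as long as
$E[S_t^{-j q^2 (q+1)/2}] < \infty$
for $j \le q$, Assumption 5 holds. Since here $q = 2$ and $S_t^2 \sim \chi^2_\nu$, it suffices to have
$\nu > 12$. Under this condition,

$$\mathrm{Var}[d_0'(0)] = \frac{2(\nu + 1)\, \Gamma(\frac{\nu}{2} + 1)^2}{\Gamma(\frac{\nu+1}{2})^2}\; E\Big[\frac{\zeta_0^2}{\zeta_0^2 + S_0^2}\Big].$$

Because $S_t^2$ is the sum of squares of $\nu$ iid $N(0, 1)$ random variables that are
independent of $\zeta_t \sim N(0, 1)$, by symmetry, $E[\zeta_0^2 / (\zeta_0^2 + S_0^2)] = 1/(\nu + 1)$, so

$$\mathrm{Var}[d_0'(0)] = \frac{2\, \Gamma(\frac{\nu}{2} + 1)^2}{\Gamma(\frac{\nu+1}{2})^2}.$$

Then $E[r''(0) \mid \eta]$ can be calculated by Proposition 3.3. □
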



4 Proofs 

4.1 Some inequalities 

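For any set $A$, denote by $\#A$ the number of its elements.

Lemma 4.1 Let $\mathcal{H}$ be a finite set and $w_a \ge 0$, $v_a \ge 0$ for $a \in \mathcal{H}$ such that
$w := \sum_a w_a > 0$ and $v := \sum_a v_a > 0$. Then for any vectors $x_a$, $a \in \mathcal{H}$,

$$\Big| \frac{\sum_a w_a x_a}{w} - \frac{\sum_a v_a x_a}{v} \Big| \le \max_{a,b \in \mathcal{H}} |x_a - x_b| \Big[ 1 - \frac{\min_a (v_a / w_a)}{\max_a (v_a / w_a)} \Big].$$

Proof. Enumerate the elements in $\mathcal{H}$ in an arbitrary order. Then the left hand side
equals $|T| / D$, where

$$T = \sum_{a,b} w_a v_b (x_a - x_b) = \sum_{a<b} (w_a v_b - w_b v_a)(x_a - x_b),$$

$$D = \sum_{a,b} w_a v_b \ge \sum_{a<b} (w_a v_b + w_b v_a).$$

Denote $\Delta = \max_{a,b} |x_a - x_b|$. Then

$$\frac{|T|}{D} \le \Delta\, \frac{\sum_{a<b} |w_a v_b - w_b v_a|}{\sum_{a<b} (w_a v_b + w_b v_a)} \le \Delta \max_{a<b} \frac{|w_a v_b - w_b v_a|}{w_a v_b + w_b v_a}
= \Delta \Big[ 1 - \min_{a,b} \frac{2 v_a / w_a}{v_a / w_a + v_b / w_b} \Big] \le \Delta \Big[ 1 - \frac{\min_a (v_a / w_a)}{\max_a (v_a / w_a)} \Big].$$

□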
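
A quick numerical check of the weighted-average inequality of Lemma 4.1 is sketched below, reading the lemma as: the distance between two weighted averages of the same vectors is at most the largest pairwise distance among the vectors times one minus the ratio of the smallest to the largest weight ratio. The function name `lemma41_sides`, the use of the Euclidean norm, and the strictly positive random weights are assumptions made for this illustration.

```python
import numpy as np

def lemma41_sides(w, v, x):
    """Return (lhs, rhs) of the weighted-average inequality.

    w, v : strictly positive weight vectors of length k (assumption: no zeros,
           so that the ratios v_a / w_a are well defined);
    x    : array of shape (k, dim), one vector per index.
    """
    lhs = np.linalg.norm(w @ x / w.sum() - v @ x / v.sum())
    pairwise = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=2)
    ratio = v / w
    rhs = pairwise.max() * (1.0 - ratio.min() / ratio.max())
    return lhs, rhs

rng = np.random.default_rng(2)
results = []
for _ in range(200):
    k, dim = rng.integers(2, 6), rng.integers(1, 4)
    w = rng.uniform(0.1, 2.0, size=k)
    v = rng.uniform(0.1, 2.0, size=k)
    x = rng.standard_normal((k, dim))
    results.append(lemma41_sides(w, v, x))

print(max(lhs - rhs for lhs, rhs in results))   # should not be positive beyond rounding
```

When the two weight vectors are proportional the right hand side is zero, matching the fact that the two weighted averages then coincide.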
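Lemma 4.2 Let $A$ and $B$ be finite sets and $w_a, v_a, x_a > 0$ for $a \in A \cup B$. Then

$$\Big| \frac{\sum_{b \in B} w_b x_b}{\sum_{a \in A} w_a x_a} - \frac{\sum_{b \in B} v_b x_b}{\sum_{a \in A} v_a x_a} \Big| \le \#B \times \Big( \frac{\max_{b \in B} x_b}{\min_{a \in A} x_a} \Big) \max_{a \in A,\, b \in B} \Big| \frac{w_b}{w_a} - \frac{v_b}{v_a} \Big|.$$

Proof. The left hand side equals $|T| / D$, where

$$T = \sum_{a \in A,\, b \in B} x_a x_b (w_b v_a - w_a v_b) = \sum_{a \in A,\, b \in B} x_a x_b\, w_a v_a \Big( \frac{w_b}{w_a} - \frac{v_b}{v_a} \Big),$$

$$D = \sum_{a, a' \in A} x_a x_{a'}\, w_a v_{a'} \ge \Big( \min_{a \in A} x_a \Big) \sum_{a \in A} w_a v_a x_a.$$

Then the lemma follows from

$$|T| \le \#B \times \Big( \max_{b \in B} x_b \Big) \max_{a \in A,\, b \in B} \Big| \frac{w_b}{w_a} - \frac{v_b}{v_a} \Big| \sum_{a \in A} w_a v_a x_a.$$

□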



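Lemma 4.3 Let $\mathcal{H}$ be a finite set and $q \in \mathbb{N}$. For $a \in \mathcal{H}$, let $w_a : \mathbb{R}^s \to [0, \infty)$ and
$g_a : \mathbb{R}^s \to \mathbb{R}$ be $q$ times differentiable. Suppose $w := \sum_a w_a > 0$. Define the function
$g = w^{-1} \sum_a w_a g_a$. Enumerate $\mathcal{H}$ in an arbitrary order. Then for $\nu$ with $|\nu| = 1$,

$$g^{(\nu)} = w^{-1} \sum_a w_a g_a^{(\nu)} + w^{-2} \sum_{a<b} \big(w_a^{(\nu)} w_b - w_a w_b^{(\nu)}\big)(g_a - g_b), \qquad (4.1)$$

and more generally, for $\nu$ with $1 \le |\nu| \le q$,

$$g^{(\nu)} = w^{-1} \sum_a w_a g_a^{(\nu)} + \sum_{k=2}^{|\nu|+1} w^{-k} \sum_{0 \le \iota < \nu} U_{k,\nu,\iota}, \qquad (4.2)$$

where $U_{k,\nu,\iota}$ can be written as

$$U_{k,\nu,\iota} = \sum_{\substack{a_1, \ldots, a_k \in \mathcal{H},\ a_1 < a_2 \\ \iota_1 + \cdots + \iota_k = \nu - \iota}} c_{k,\nu}(a_1, \ldots, a_k, \iota_1, \ldots, \iota_k)\, \big(g_{a_1}^{(\iota)} - g_{a_2}^{(\iota)}\big) \prod_{s=1}^{k} w_{a_s}^{(\iota_s)},$$

with $c_{k,\nu}(a_1, \ldots, a_k, \iota_1, \ldots, \iota_k)$ being constants.

Proof. If $|\nu| = 1$, then

$$g^{(\nu)} = w^{-1} \sum_a w_a g_a^{(\nu)} + w^{-1} \sum_a w_a^{(\nu)} g_a - w^{-2} \sum_a w_a g_a \sum_b w_b^{(\nu)}$$

$$= w^{-1} \sum_a w_a g_a^{(\nu)} + w^{-2} \sum_{a \ne b} \big(w_a^{(\nu)} w_b - w_a w_b^{(\nu)}\big) g_a$$

$$= w^{-1} \sum_a w_a g_a^{(\nu)} + w^{-2} \sum_{a<b} \big(w_a^{(\nu)} w_b - w_a w_b^{(\nu)}\big)(g_a - g_b),$$

showing (4.1), and hence (4.2) for $|\nu| = 1$. Let $\nu = e + \mu$, where $|e| = 1$ and
$0 < \mu < \nu$. Suppose $g^{(\mu)}$ has the form (4.2). Then

$$g^{(\nu)} = f^{(e)} + \sum_{k=2}^{|\mu|+1} \sum_{0 \le \iota < \mu} \big(w^{-k} U_{k,\mu,\iota}\big)^{(e)},$$

where $f = w^{-1} \sum_a w_a f_a$, with $f_a = g_a^{(\mu)}$. By (4.1),

$$f^{(e)} = w^{-1} \sum_a w_a f_a^{(e)} + w^{-2} \sum_{a<b} \big(w_a^{(e)} w_b - w_a w_b^{(e)}\big)(f_a - f_b)
= w^{-1} \sum_a w_a g_a^{(\nu)} + w^{-2} \sum_{a<b} \big(w_a^{(e)} w_b - w_a w_b^{(e)}\big)\big(g_a^{(\mu)} - g_b^{(\mu)}\big).$$

On the other hand, for each $k = 2, \ldots, |\nu|$ and $0 \le \iota < \mu$,

$$\big(w^{-k} U_{k,\mu,\iota}\big)^{(e)} = w^{-k}\, U_{k,\mu,\iota}^{(e)} - k\, w^{-k-1}\, U_{k,\mu,\iota} \sum_{a \in \mathcal{H}} w_a^{(e)}.$$

It is then not hard to see that $g^{(\nu)}$ has the form (4.2). The proof is complete by
induction. □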



4.2 Basic facts 

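Define matrix-valued functions $L_n(\varepsilon) = (L_{n,ab}(\varepsilon),\ a, b \in \mathcal{H})$, such that for $n > 0$,

$$L_{\pm n,ab}(\varepsilon) = E_\sigma\Big[ 1\{\sigma_{\pm n} = b\} \prod_{s=1}^{n} \psi_{\pm s}(\varepsilon, \sigma_{\pm s}) \,\Big|\, \sigma_0 = a \Big]. \qquad (4.3)$$

Then from (2.5),

$$\Lambda_{n,a}(\varepsilon) = \frac{\sum_{b \in \mathcal{H}} L_{n,ab}(\varepsilon)}{\sum_{b \in \mathcal{H}} L_{n,ib}(\varepsilon)}. \qquad (4.4)$$

For ease of notation, when there is no confusion, $\varepsilon$ will be omitted.

Lemma 4.4 Let Assumptions 1-4 hold. Then for each $n$ and $a, b \in \mathcal{H}$, $L_{n,ab} \in
C^{(q)}$, and for $|n| \ge \kappa$, $L_{n,ab}$ is positive and finite.

Proof. By Assumption 4, $\psi_n(\varepsilon, a) \in C^{(q)}$ for each $n \in \mathbb{Z}$ and $a \in \mathcal{H}$, implying
$L_{\pm n,ab} \in C^{(q)}$. For $n \ge \kappa$ and $a, b \in \mathcal{H}$, as $P_{0n}(a, b) > 0$, there is at least one $v =
(v_1, \ldots, v_n)$ with $v_n = b$ and $\Pr\{\sigma_1 = v_1, \ldots, \sigma_n = v_n \mid \sigma_0 = a\} > 0$. For each such $v$
and $t = 1, \ldots, n$, by Assumption 3, $\psi_t(\varepsilon, v_t) \in (0, \infty)$. Therefore, $L_{n,ab}(\varepsilon) \in (0, \infty)$.
The proof for $L_{-n,ab}$ is similar. □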
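According to the Lemma, $\Lambda_{n,a} \in (0, \infty)$ once $|n| \ge \kappa$. Also, by Assumptions 2 and
3, $P_0(a) > 0$, $\psi_0(\varepsilon, a) > 0$. Therefore,
$\rho_{mn}(\varepsilon) \in (0, \infty)$.
The following relation will be repeatedly used:

$$L_{n,ab} = \psi_n(\varepsilon, b) \sum_{e \in \mathcal{H}} L_{n-k,ae}\, I_{n,eb}^{(k)}, \quad a, b \in \mathcal{H},\ 1 \le k \le n, \qquad (4.5)$$

where

$$I_{n,eb}^{(k)} = I_{n,eb}^{(k)}(\varepsilon) = E_\sigma\Big[ 1\{\sigma_n = b\} \prod_{t=n-k+1}^{n-1} \psi_t(\varepsilon, \sigma_t) \,\Big|\, \sigma_{n-k} = e \Big]. \qquad (4.6)$$

A similar relation holds when $n < 0$.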



4.3 Proof of Theorem 2.2 



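Lemma 4.5 Let Assumptions 1-3 hold.

1. Given $a, b \in \mathcal{H}$ and $\varepsilon$, for $|n| \ge \kappa$, $\min_c \frac{L_{n,bc}}{L_{n,ac}}$ is strictly positive and
increasing in $|n|$, and $\max_c \frac{L_{n,bc}}{L_{n,ac}}$ is finite and decreasing in $|n|$.

2. There is an increasing deterministic function $r(\varepsilon_0) \in (0, 1)$, such that given
$\varepsilon_0 > 0$, for almost all realizations of $Z$ and $\eta$,

$$\Delta_n(\varepsilon) := \max_{a,b,c,d} \Big| \frac{L_{n,bc}(\varepsilon)}{L_{n,ac}(\varepsilon)} - \frac{L_{n,bd}(\varepsilon)}{L_{n,ad}(\varepsilon)} \Big| \le C\, r(\varepsilon_0)^{|n|}, \quad |n| \ge \kappa,\ |\varepsilon| \le \varepsilon_0, \qquad (4.7)$$

where $C = C(\varepsilon_0, Z)$ is a random variable that only depends on $\varepsilon_0$ and $Z$ and is
finite almost surely. Additionally, for fixed $\varepsilon$, $\Delta_{\pm n}(\varepsilon)$ are decreasing in $n$.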



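Proof. We only consider $n > 0$. The case $n < 0$ is similar. Given $a \ne b \in \mathcal{H}$, for
$n \ge \kappa$ and $c \in \mathcal{H}$, by Lemma 4.4, $L_{n,bc} / L_{n,ac} \in (0, \infty)$. Then by (4.5),

$$\frac{L_{n,bc}}{L_{n,ac}} = \frac{\sum_e L_{n-k,be}\, I_{n,ec}^{(k)}}{\sum_e L_{n-k,ae}\, I_{n,ec}^{(k)}}. \qquad (4.8)$$

Letting $k = 1$, it is easy to see that

$$\min_e \frac{L_{n-1,be}}{L_{n-1,ae}} \le \frac{L_{n,bc}}{L_{n,ac}} \le \max_e \frac{L_{n-1,be}}{L_{n-1,ae}}, \quad \text{all } c \in \mathcal{H},$$

which implies part 1.

Given $1 \le k \le n$ and $\varepsilon$, for each $a, b, c, d \in \mathcal{H}$, apply Lemma 4.1 to
$x_e = L_{n-k,be} / L_{n-k,ae}$, $w_e = L_{n-k,ae}\, I_{n,ec}^{(k)}$ and $v_e = L_{n-k,ae}\, I_{n,ed}^{(k)}$. Then by (4.8),

$$\Big| \frac{L_{n,bc}}{L_{n,ac}} - \frac{L_{n,bd}}{L_{n,ad}} \Big| \le \max_{c,d} \Big| \frac{L_{n-k,bc}}{L_{n-k,ac}} - \frac{L_{n-k,bd}}{L_{n-k,ad}} \Big| \times \Big[ 1 - \frac{\min_e \big(I_{n,ed}^{(k)} / I_{n,ec}^{(k)}\big)}{\max_e \big(I_{n,ed}^{(k)} / I_{n,ec}^{(k)}\big)} \Big].$$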
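Take maximum over $c$ and $d$ and then over $a$ and $b$. It follows that

$$\Delta_n(\varepsilon) \le \gamma_n \Delta_{n-k}(\varepsilon), \quad \text{with } \gamma_n = \gamma_n(\varepsilon, k) = 1 - \min_{c,d} \frac{\min_e \big(I_{n,ed}^{(k)} / I_{n,ec}^{(k)}\big)}{\max_e \big(I_{n,ed}^{(k)} / I_{n,ec}^{(k)}\big)}. \qquad (4.9)$$

For $z = (z_1, \ldots, z_{\kappa-1}) \in \mathcal{Z}^{\kappa-1}$, define

$$\alpha(z, \varepsilon) = \min_{\substack{u_t, v_t \in \mathcal{H} \\ 1 \le t \le \kappa-1}} \prod_{t=1}^{\kappa-1} f(\varphi(z_t, \theta_{u_t}(\varepsilon)), \theta_{v_t}(\varepsilon)), \qquad \alpha_*(z, \varepsilon_0) = \inf_{|\varepsilon| \le \varepsilon_0} \alpha(z, \varepsilon),$$

$$\beta(z, \varepsilon) = \max_{\substack{u_t, v_t \in \mathcal{H} \\ 1 \le t \le \kappa-1}} \prod_{t=1}^{\kappa-1} f(\varphi(z_t, \theta_{u_t}(\varepsilon)), \theta_{v_t}(\varepsilon)), \qquad \beta^*(z, \varepsilon_0) = \sup_{|\varepsilon| \le \varepsilon_0} \beta(z, \varepsilon).$$

For $n \ge \kappa$, let

$$c_n = c_n(\varepsilon_0) = \alpha_*(Z_{n-\kappa+1}, \ldots, Z_{n-1}, \varepsilon_0), \qquad C_n = C_n(\varepsilon_0) = \beta^*(Z_{n-\kappa+1}, \ldots, Z_{n-1}, \varepsilon_0).$$

Since $\psi_t(\varepsilon, \sigma_t) = f(\varphi(Z_t, \theta_{\eta_t}(\varepsilon)), \theta_{\sigma_t}(\varepsilon))$, then for $|\varepsilon| \le \varepsilon_0$,

$$c_n \le \prod_{t=n-\kappa+1}^{n-1} \psi_t(\varepsilon, \sigma_t) \le C_n, \qquad (4.10)$$

$$\Longrightarrow\ c_n P_{n-\kappa,n}(e, c) \le I_{n,ec}^{(\kappa)}(\varepsilon) \le C_n P_{n-\kappa,n}(e, c). \qquad (4.11)$$

Given $z \in \mathcal{Z}^{\kappa-1}$, by $\#\mathcal{H} < \infty$ and Assumption 3, $\alpha(z, \varepsilon)$ and $\beta(z, \varepsilon)$ are
continuous in $\varepsilon$ and $0 < \alpha(z, \varepsilon) \le \beta(z, \varepsilon) < \infty$, yielding $0 < \alpha_*(z, \varepsilon_0) \le \beta^*(z, \varepsilon_0) <
\infty$. As a result, $\Pr\{0 < c_n \le C_n < \infty\} = 1$. Fix $0 < x < y < \infty$, such that
$p_0 := \Pr\{x \le c_\kappa \le C_\kappa \le y\} > 0$. Note that $x$ and $y$ can be chosen in such a
way that they only depend on $\varepsilon_0$, the distribution of $Z$, and $\kappa$. Because $Z_t$ are
iid, from the definitions of $c_n$ and $C_n$, almost surely, there is an infinite sequence
$n_s = n_s(Z, \varepsilon_0) \ge \kappa$, $s \ge 0$, such that

$$x \le c_{n_s} \le C_{n_s} \le y \qquad (4.12)$$

and furthermore, $n_s$ can be chosen in such a way that

$$n_s \ge n_{s-1} + \kappa, \qquad |\{s : n_s \le n\}| \ge \frac{p_0 n}{2\kappa} \quad \text{for } n \gg 1. \qquad (4.13)$$

On the other hand, since $\#\mathcal{H} > 1$, Assumption 2 implies that

$$\phi_* \le P_{n-\kappa,n}(e, c) \le 1 - \phi_*, \quad \text{all } c, e \in \mathcal{H}. \qquad (4.14)$$

Combine (4.11), (4.12) and (4.14) to get

$$0 < \phi_* x \le I_{n_s,ec}^{(\kappa)}(\varepsilon) \le (1 - \phi_*)\, y < \infty, \quad \forall c, e \in \mathcal{H},$$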
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and hence 



< 1 



<ro = ro(eo) := 1 



(1 - 4>*)y. 



< 1. 



Now by (4.9), A„^(e) < A„^_K(e)ro. Since Ug-i < Ug — k while (4.9) imphes 
that An{e) is decreasing, A„^(e) < A„^_-^(e)ro and hence A„^(e) < A„^(e)ro~"^ by 
induction. For any n, if < n < n^+i, then A„(e) < A„^(e)rQ~^ < AK;(e)rQ~"'^. 
Combining (4.13), for n ^ 1, 



PO 



An{e) < [A,(e)/ro]r(eo)", with r(eo) =^0^". 

Notice that A«(e) < maxa,b,c LTbti) ' (4.3) and (4.10) followed by As- 

sumption 2, it is seen that 

LK,ac{£:) / L -Pok(^,c) ^ (l-</'*)^K 



max 



< — max ■ 



< 



< oo. 



Therefore, (4.7) is proved. 

To make $r(\epsilon_0)$ increasing, replace $r(\epsilon_0)$ with, say, $[\inf_{c \ge \epsilon_0} r(c) + 1]/2$. From the construction, $r(\epsilon_0)$ only depends on the distributional properties of $Z$ and $\eta$, but not on specific realizations of the processes. Therefore, $r(\epsilon_0)$ is deterministic. □
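The geometric decay just proved can be observed numerically. The following is a minimal sketch, not the construction used in the proof: it assumes a homogeneous two-state chain, a Gaussian emission density standing in for $\psi_t$, that $L_n$ is the product of the one-step matrices $P\,\mathrm{diag}(\psi_t)$ as suggested by (4.3), and that $\Delta_n$ is the largest difference between row ratios of $L_n$, the form displayed for $n = 1$ in (4.35). The names `gauss_pdf`, `delta` and `delta_path` are hypothetical.

```python
import math
import random

def gauss_pdf(x, mean):
    """Density of N(mean, 1) at x; an assumed stand-in for psi_t."""
    return math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2.0 * math.pi)

def delta(L):
    """max over a, b, c, d of |L[b][c]/L[a][c] - L[b][d]/L[a][d]|."""
    m = len(L)
    worst = 0.0
    for a in range(m):
        for b in range(m):
            ratios = [L[b][c] / L[a][c] for c in range(m)]
            worst = max(worst, max(ratios) - min(ratios))
    return worst

def delta_path(P, means, xs):
    """Delta_1, ..., Delta_n for L_n = prod_t P diag(psi_t)."""
    m = len(P)
    L = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]
    deltas = []
    for x in xs:
        psi = [gauss_pdf(x, mu) for mu in means]
        step = [[P[i][j] * psi[j] for j in range(m)] for i in range(m)]
        L = [[sum(L[i][e] * step[e][j] for e in range(m)) for j in range(m)]
             for i in range(m)]
        total = sum(sum(row) for row in L)
        # Rescaling the whole matrix avoids underflow and leaves ratios unchanged.
        L = [[v / total for v in row] for row in L]
        deltas.append(delta(L))
    return deltas

# Simulate 50 observations from an illustrative chain (all values are assumptions).
random.seed(0)
P = [[0.9, 0.1], [0.2, 0.8]]
means = [0.0, 1.0]
state, xs = 0, []
for _ in range(50):
    state = 0 if random.random() < P[state][0] else 1
    xs.append(random.gauss(means[state], 1.0))
deltas = delta_path(P, means, xs)
```

In this sketch each new row ratio is a convex combination of the previous ones, so `deltas` is nonincreasing, which mirrors the one-step contraction in (4.9); the overall decay is geometric because the transition matrix is strictly positive.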



Lemma 4.6 Fix a £ 7i and e. 

1. For a en, 

< inf A„,a(e) < sup An,a(e) < oo- 



\n\>K 



\n\>K 



2. For s > n > K, and s < n < —k, 

|A„,«(e) - K^a{e)\ < 2An{e) + A,(e). 
Proof. From (4.4), for s > n > k and s < n < —k, 



mm —— , max — - 



Together with part 1 of Lemma 4.5, this yields the first part of the lemma and also 



An,a(e) 



where b £ 7i is arbitrary. Then by 



< A„, 



< 



|A„,a(e) - A,,a(e)| 
A„,.(.) ^"'"^^"^ 



+ 



A.,a(e) 



Ls,ab{^) 



the second part of the lemma follows 



Ls,ib{£) 



+ 



Ln,ab{^) Lgabi^) 



Ln,ibi£) Ls^ibis) 



□ 
Proof of Theorem 2.2. From Lemmas 4.5 and 4.6, it is seen that given $\epsilon_0 > 0$, almost
surely, as n ^ oo, An^ai^) La(e) and A_„^a(e) La(e) uniformly for |e| < eo, at 
rate o(r(eo)")- Since A-|-„^a(e) are continuous, the uniform convergence implies that 
\-a{e) and La(e) are continuous. Also, the lemmas imply that La(e) and La(e) are 
strictly positive. By a monotonicity argument, almost surely, the convergence holds
simultaneously for all $\epsilon_0 > 0$. □



4.4 Proof of Theorem 2.3 

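For $t \ne 0$, $n \ge 1$ and $\epsilon_0 > 0$, define

$$V_{\pm n}(\epsilon_0) = n \max_{1 \le t \le n} D_{\pm t}(\epsilon_0), \quad \text{with } D_t(\epsilon_0) = \max_{|\nu| \le q} \max_{a \in \mathcal{H}} \sup_{|\epsilon| \le \epsilon_0} \left| \frac{\psi_t^{(\nu)}(\epsilon,a)}{\psi_t(\epsilon,a)} \right|, \tag{4.15}$$

where $\psi_t^{(\nu)}$ is a derivative with respect to $\epsilon$. Note $D_t(\epsilon_0) \ge 1$ since the maximization in its definition takes into account $\nu = 0$.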



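Lemma 4.7 The following statements are true.

1. For $\epsilon_0 > 0$ and $n \ge 1$,

$$V_n(\epsilon_0) \le n \max_{|t| \le n} [q + M_q(Z_t,\epsilon_0)]^q. \tag{4.16}$$

2. If Assumptions 1-4 hold, then $\Pr\{\lim_n \beta^{-n} V_n(\epsilon_0) = 0,\ \forall \beta > 1,\ \epsilon_0 > 0\} = 1$.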

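Proof. To show part 1, it suffices to show that for all $\nu$ with $|\nu| = l \le q$, and all $\epsilon_0 > 0$ and $t \ne 0$,

$$d_{\nu t}(\epsilon_0) := \max_{a \in \mathcal{H}} \sup_{|\epsilon| \le \epsilon_0} \left| \frac{\psi_t^{(\nu)}(\epsilon,a)}{\psi_t(\epsilon,a)} \right| \le [l + M_l(Z_t,\epsilon_0)]^l. \tag{4.17}$$

It is easily seen that (4.17) holds for $l = 0$. Suppose (4.17) holds if $|\nu| \le l$. Let $|\nu| = l + 1$. Without loss of generality, let $\nu = e + \mu$, where $e = (1, 0, \ldots, 0)$ and $\mu = (\mu_1, \ldots, \mu_p) \ge 0$. Let $\ell_{z,ab}(\epsilon) = \ln f(\varphi(z, \theta_a(\epsilon)), \theta_b(\epsilon))$ as in Assumption 4.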
Then -^^(^^ ' 



byVr(e,a)=^i(e,a)4^/_^^^Je), 



E 

i<li 



i 



where (^) = (^) • • • 5) • 
For $i \le \mu$, $|\ell^{(\nu-i)}_{Z_t,\eta_t a}(\epsilon)| \le M_{l+1}(Z_t,\epsilon_0)$. Then, as $|\mu| = l$, by the induction hypothesis,



max sup 



#)(e,a) 



V't(e,a) 



< Mi{Zt,eo)Y,{^\\i\+Mi{Zueo)]\ 
<Mi{Zt,eo)Y,(fl [l + Mi{Zt,eo)f 



= Mi{Zt,eo)[\u\+MiiZt,e)]\ 

which implies (4.17). By induction, (4.17) holds for all $|\nu| \le q$.

Because $V_n(\epsilon_0)$ is increasing in $\epsilon_0$, to show part 2, it suffices to show (4.16) for
each fixed $\epsilon_0 > 0$ and $\beta > 1$. Fix an arbitrary $c \in (1,\beta)$, such that $c^q < \beta$. By
part 1 and Assumption 4, for some $p = p(\epsilon_0) > 2$,



Pr{Vn{eo) > nd>'^] = Pr <^ maxMg(Zt,eo) > ^ 

< 2nPr{Mg(Zo,eo) > c"} = o(n 



Then part 2 follows from the Borel-Cantelli Lemma and $n c^{qn} = o(\beta^n)$.



□ 



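Lemma 4.8 Let Assumptions 1-4 hold. Fix $a, b, c \in \mathcal{H}$ and $k \ge 1$. Let

$$W_n(\epsilon) = L_{n-k,ab}(\epsilon) I^{(k)}_{n,bc}(\epsilon), \quad n \ge k,$$

where $I^{(k)}_{n,bc}$ is defined in (4.6). Given $\nu \ge 0$ with $|\nu| \le q$ and $\epsilon_0 > 0$, for $n \ge 0$,

$$\sup_{|\epsilon| \le \epsilon_0} \frac{|L^{(\nu)}_{n,ab}(\epsilon)|}{L_{n,ab}(\epsilon)} \le [V_n(\epsilon_0)]^{|\nu|},$$

with $V_n(\epsilon_0) := 0$ if $n = 0$, while for $n \ge k$,

$$\sup_{|\epsilon| \le \epsilon_0} \frac{|W_n^{(\nu)}(\epsilon)|}{W_n(\epsilon)} \le [V_{n-1}(\epsilon_0)]^{|\nu|}.$$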



Proof. For ly = (z/i, . . . , Up) with 1 < |z^| < q, it is not hard to get 



llA \-ln = l' t=l 



do = a 



For any sequence $l_1, \ldots, l_n$ in the sum, at most $|\nu|$ of them are nonzero. For
each $l_t > 0$, $|\psi_t^{(l_t)}(\epsilon,\sigma_t)| \le D_t(\epsilon_0)\psi_t(\epsilon,\sigma_t)$ for $|\epsilon| \le \epsilon_0$. As a result,



t=i 



< 



max Dt{eo) 

l<t<n 



\u\ n 

JJ^t(e,crt). 

t=l 
On the other hand, there are $n^{\nu_1} \cdots n^{\nu_p} = n^{|\nu|}$ such sequences. Then



< 



n max Dtien) 

l<t<n 



n max Dfien) 
l<t<n ^ ^' 



t=l 



do = a 



This completes the proof of the first inequality. To show the second inequality, first. 



n,bc 



{u-i) 



(e). 



Using the definition of I^^l^ and following the treatment for 



n,bc 



max Dt{eo) 

n-k+l<t<n-l 



\u\ — \i\ 



Combining the bound with the one for L!^^_f, ab^^)^ 



max A(eo) 

l<t<n-l 



= [K_i(eo)]'"'w^n(e) 
This finishes the proof. 



□ 



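Lemma 4.9 Let Assumptions 1-4 hold. Define, for $\nu$ with $|\nu| = 1, \ldots, q$,

$$\Delta_{n,\nu}(\epsilon) := \max_{a,b,c,d} \left| \left[\frac{L_{n,bc}}{L_{n,ac}}\right]^{(\nu)}(\epsilon) - \left[\frac{L_{n,bd}}{L_{n,ad}}\right]^{(\nu)}(\epsilon) \right|. \tag{4.18}$$

Then for each $\nu$, there is an increasing deterministic function $0 < r_\nu(\epsilon_0) < 1$ in
$\epsilon_0 > 0$, such that almost surely, as $n \to \infty$,

$$\sup_{|\epsilon| \le \epsilon_0} \Delta_{n,\nu}(\epsilon) = o(r_\nu(\epsilon_0)^{|n|}), \quad \text{all } \epsilon_0 > 0.$$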

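Proof. We only consider $n > 0$. The case $n < 0$ can be handled similarly. Given
$k$, define $I^{(k)}_{n,ec}(\epsilon)$ as in (4.6). Given $a \ne b \in \mathcal{H}$, write $W_{n,ec}(\epsilon) = L_{n-k,ae}(\epsilon) I^{(k)}_{n,ec}(\epsilon)$,
$W_{n,c}(\epsilon) = \sum_e W_{n,ec}(\epsilon)$. Then by (4.8), for $n \ge k$,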



■'n—kfie 



■^n,bc 

n,ac „ ^n—k,ae 
Fix $l = 1, \ldots, q$. By Lemma 4.3, for $\nu$ with $|\nu| = l$,



Ln—k,be 
I-'n—k,ae 



+ R. 



n,u,ci 



(4.19) 



where 



$R_{n,\nu,c}$ = a linear combination of



n vyn,esC 
. Wr,..r. 



L 



n—k,bei 



L 



n—k,aei 



(J) 



n—k,be2 



L 



n—k,ae2 



across $m = 2, \ldots, |\nu| + 1$, $i_1, \ldots, i_m \ge 0$, $0 \le j < \nu$ with $i_1 + \cdots + i_m + j = \nu$, and
$e_1, \ldots, e_m \in \mathcal{H}$ with $e_1 < e_2$. Then, by the same argument that leads to (4.9),



{e) + 2 max \Rn,v,c{£)\, 



(4.20) 



where $\gamma_n \in [0, 1]$ is given in (4.9).

The rest of the proof is by induction on $l$. First, let $|\nu| = 1$. By Lemma 4.3,



ei<e2 ^ 



L 



n—k,bei 



L 



n—k,be2 



L 



n~k,ae\ -^'n— fc,ae2 



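Fix $\epsilon_0 > 0$. By Lemma 4.8, for $|\epsilon| \le \epsilon_0$,

$$|W^{(\nu)}_{n,ec}(\epsilon)| \le V_{n-1}(\epsilon_0) W_{n,ec}(\epsilon), \tag{4.21}$$

and therefore,

$$\max_c |R_{n,\nu,c}(\epsilon)| \le V_{n-1}(\epsilon_0)\Delta_{n-k}(\epsilon). \tag{4.22}$$

By Lemma 4.5, there is an increasing deterministic $r = r(\epsilon_0) \in (0,1)$, such that
$\sup_{|\epsilon| \le \epsilon_0} \Delta_n(\epsilon) \le r^n$ for $n \gg 1$. Fix $\beta \in (1, 1/r)$. Then by Lemma 4.7 and (4.21),
for $n \gg 1$ and $|\epsilon| \le \epsilon_0$,

$$\Delta_{n,\nu}(\epsilon) \le \gamma_n\Delta_{n-k,\nu}(\epsilon) + \beta^n r^{n-k} \le \Delta_{n-k,\nu}(\epsilon) + \beta^n r^{n-k}. \tag{4.23}$$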

Let $k = 1$ to get $\Delta_{n,\nu}(\epsilon) \le \Delta_{n-1,\nu}(\epsilon) + \beta^n r^{n-1}$. So by induction, for $s < n$,

$$\Delta_{n,\nu}(\epsilon) \le \Delta_{s,\nu}(\epsilon) + \sum_{t=s}^{n-1} \beta^{t+1} r^{t}. \tag{4.24}$$



Next let $k = \kappa$. By the same argument that follows (4.14), $r$ can be chosen in
such a way that there is a sequence $n_s = n_s(Z,\epsilon_0)$ that satisfies (4.13) and $\gamma_{n_s} \le r$.
By the first inequality in (4.23), for $s \gg 1$,



A„,,.(e) < rA„^_,,,(e) +/3"=r"»- < rA„^„«,,(e) + (/3r) 



ns-i 
Let $n = n_s - \kappa$ and $s = n_{s-1}$ in (4.24) and combine it with the last inequality to get
where c = (3'^ + — r(3). Then by induction and the fact that Ug > ks, 



t=i 

t=i 

Now for any $n_s \le n < n_{s+1}$, by (4.24) and the above inequality,

Since for s 1, s + 1 > l^n^+i > it can be seen that An^i^{s) = 0{c^)^ with 
c = (/3r)2K < 1. Since /3 E (1, 1/r) is arbitrary, it follows that for a given and 
any ri > := r^^, say n = ri(eo) = (1 +r^)/2, sup|£|<£Q A„,j,(e) = o(r") almost 
surely. By monotonicity, it can be seen that almost surely, the exponentially fast
convergence holds simultaneously for all $\epsilon_0$.

Now let $|\nu| > 1$. To bound $R_{n,\nu,c}(\epsilon)$, for $s = 2, \ldots, |\nu| + 1$, and $p$-tuples of
nonnegative integers $i_1, \ldots, i_s$, $j$ with $i_1 + \cdots + i_s = \nu - j \le \nu$, and $e_1, \ldots, e_s \in \mathcal{H}$, by
Lemma 4.8, for $|\epsilon| \le \epsilon_0$,



Wi%l---Willl\ < n [Vn-l{eo)f''\Wn,e,c < T^„% [K-1 (^o)] ' 



l^n-iy^UJi ''•'n.efeC ^ " n,c I'' n-l\^\J J j 

k=l 

so in place of (4.22), we have

$$\max_c |R_{n,\nu,c}(\epsilon)| \le C_\nu [V_{n-1}(\epsilon_0)]^{|\nu|} \times \sum_{j<\nu} \Delta_{n-k,j}(\epsilon), \tag{4.25}$$

where $\Delta_{n-k,0}(\epsilon) := \Delta_{n-k}(\epsilon)$ and $C_\nu > 0$ is some constant only depending on $\nu$.

Suppose it has been shown that for each $j < \nu$, there is $r_j = r_j(\epsilon_0) \in (0,1)$,
such that $\sup_{|\epsilon| \le \epsilon_0} \Delta_{n,j}(\epsilon) = o(r_j^n)$. Then using (4.25) and following the argument
for $\Delta_{n,j}(\epsilon)$ with $|j| = 1$, $\sup_{|\epsilon| \le \epsilon_0} \Delta_{n,\nu}(\epsilon) = o(r_\nu^n)$ for some $r_\nu = r_\nu(\epsilon_0) \in (0,1)$. By
induction, the exponential rate of convergence holds for all $\nu$ with $|\nu| \le q$. Again,
from the construction, $r_\nu$ only depends on the distributional properties of $Z$ and $\eta$
and hence is deterministic. □

Set $k = 1$ in (4.19). For $n \ge k$ and $a, b, c \in \mathcal{H}$,

-T- —] - \Rn,u,c\ < [ ] < max '— \ +\Rn,,^,c\, 
giving



Lr 



^n-l. 

L 



,bc 



n—l,ac 



< A„„i,^(e) + 2|i?„,^,c(e)|. (4.26) 



Corollary 4.10 Let Assumptions 1-4 hold. Then almost surely, as $s > n \to \infty$,

$$\max_{c \in \mathcal{H}} \sup_{|\epsilon| \le \epsilon_0} |R_{n,\nu,c}(\epsilon)| = o(r_\nu^n(\epsilon_0))$$



max sup 

a,b,c€'H |£|<£(, 



for all $\epsilon_0 > 0$ and $\nu$ with $1 \le |\nu| \le q$, and likewise for $L_{-n,ab}$, where $r_\nu(\epsilon_0)$ are given
in Lemma 4.9.

Proof. The first inequality is already shown in the proof of Lemma 4.9. The second one
follows from summing the inequality in (4.26) over $n + 1, \ldots, s$ and applying the
first inequality and Lemma 4.9. □

Proof of Theorem 2.3. Let $r_\nu(\epsilon_0)$ be as in Lemma 4.9. For $e \in \mathcal{H}$, denote



Then for a€7i, An,a = ^^n^Yle ^n, 



Ln.ac 



and similar to (4.19), 



(4.27) 



where $T_{n,\nu}$ is a linear combination of



(il) . ..Jin.) 



L 



n—k,ae\ 



'n—k,iei 



(i) 



n—k,ae2 



L 



n—k,te2 



across $m = 2, \ldots, |\nu| + 1$, $0 \le j < \nu$, $i_1, \ldots, i_m \ge 0$ with $i_1 + \cdots + i_m + j = \nu$, and
$e_1, \ldots, e_m \in \mathcal{H}$ with $e_1 < e_2$. Fix any $b \in \mathcal{H}$. From the above formulas,



L 



n,ab 



L 



n,ib 



(4.28) 



Following the treatment of $R_{n,\nu,c}$ in (4.25), except that we have to use the first
inequality in Lemma 4.8, it can be seen that

$$|T_{n,\nu}(\epsilon)| \le C_\nu [V_n(\epsilon_0)]^{|\nu|} \times \sum_{j<\nu} \Delta_{n-k,j}(\epsilon), \quad |\epsilon| \le \epsilon_0, \tag{4.29}$$

yielding $\max_{|\epsilon| \le \epsilon_0} |T_{n,\nu}(\epsilon)| = o(r_\nu^n)$. Now for $s \ne n$, by (4.28), it is not hard to get



< A,,^ + |r,,^| + A„,^ + |r„,^| + 



L 



s.ab 



n,ab 



L 



n,tb 



(4.30) 



Then by Lemma 4.9 and Corollary 4.10, 



sup 

|e|<eo 



Aa(^)-Aa(e) =o(rr(eo)), aeH. 
Since $\#\mathcal{H} < \infty$, almost surely, the rate holds simultaneously for all $a \in \mathcal{H}$.



□ 



4.5 Proof of Theorem 2.4 

Since the parameter k in Assumption 2 equals 1, P„_i^„(a,6) G 

b £ TC and n G Z, with < (/>^, < 1 as in Assumption 2. Consequently, 



. Pn-l,n(e,rf) 

7 = 1 - mi 5 — ; L G 

maXc,d,e P„_i,„(e,c) 



0,1 



1 



for a, 



(4.31) 



For $a, e \in \mathcal{H}$, by (4.3), $L_{1,ae}(\epsilon) = P_{01}(a,e)\psi_1(\epsilon,e)$, giving

- Ve. 



Li,be(e) _ Poi{b,e) ^ 1_ 



Then by Lemma 4.5, 



Li,aei£) Poi{a,e) 



< An,a(e) < 



1 



1 



(4.32) 



(4.33) 



This shows part 1 of Theorem 2.4. To prove part 2, we need several lemmas. 

Lemma 4.11 Fix Sq > 0. Let 7 and 0* be as in (4-31) and a = (p^^ — 1. There is 
a constant C > 0, such that i/ 1 < = / < |e| < £0 o-i^d n > 1, then 

n 

mie) - <'li,ai^)\ < Ca7("-'"^)"^n'('+2) + M,{Z,,eo)f^^^'^/' . (4.34) 

t=i 

Proof. First, by (4.32) and the definitions of $\Delta_n$ and $\Delta_{n,\nu}$ in (4.7) and (4.18),



Ai(e) = max 

a,b,c,d 



Poi(^c) Poiib,d) 



< 



7(1-'/'*) 



(4.35) 



Poi{a,c) Poi{a,d) 

^lA^) = 0' ^ > 0- 

By (4.6), /gc = Pn-i ,n(e, c), so (4.9) gives A„(e) < 7A„_i(e). Thus, 

A„(e) < a7", Vn>l,e>0. (4.36) 
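Let $R_{n,\nu,c}(\epsilon)$ be as in (4.19) and

$$\Delta_{n,l}(\epsilon) = \max_{|\nu| = l} \Delta_{n,\nu}(\epsilon).$$

Recall the definition of $V_n(\epsilon_0)$ in (4.15). For brevity, write $v_n = V_n(\epsilon_0)$. By (4.25),
there are constants $c_l \ge 1$, such that

$$\max_c |R_{n,\nu,c}(\epsilon)| \le \tfrac{1}{2} c_l v_{n-1}^l \sum_{i=0}^{l-1} \Delta_{n-1,i}(\epsilon), \tag{4.37}$$

for $l = 1, \ldots, q$, $n \ge 1$, $\epsilon_0 > 0$ and $|\epsilon| \le \epsilon_0$. Then by (4.20), for $n \ge 0$,

$$\Delta_{n+1,l}(\epsilon) \le \gamma\Delta_{n,l}(\epsilon) + c_l v_n^l \sum_{i=0}^{l-1} \Delta_{n,i}(\epsilon). \tag{4.38}$$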

We show by induction that for / > 1 and n > 0, 

i-i 

An+iX^) < aT^^+'-'^'^'nQt;! + nQ<), (4.39) 

i=l 

where A„+i,o(e) = A„+i(e). 

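By (4.35), (4.39) holds for $n = 0$ and $l \ge 1$. Let $n \ge 1$ next. If $l = 1$, then by
(4.36) and (4.38),

$$\Delta_{n+1,1}(\epsilon) \le \gamma\Delta_{n,1}(\epsilon) + c_1 v_n \Delta_n(\epsilon) \le \gamma\Delta_{n,1}(\epsilon) + a\gamma^n c_1 v_n,$$

and by induction on $n$ and (4.35),

$$\Delta_{n+1,1}(\epsilon) \le \gamma^n\Delta_{1,1}(\epsilon) + a\gamma^n c_1 \sum_{s=1}^n v_s = a\gamma^n c_1 \sum_{s=1}^n v_s \le a\gamma^n c_1 n v_n.$$

So (4.39) holds for $l = 1$. Suppose (4.39) holds for $1 \le l < k$. By $\gamma \in (0, 1)$,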



A„,i(eo) = A„(eo) + ^ A„,i(eo) 

j=0 1=1 

{fc-i i-i 

i=l /i=l 



I. i=\ h=l } 

k-1 

aj^^+'-'^^'^'Hil + cnvl), (4.40) 

i=l 
so by (4.38),

fc-i 
1=1 

Following the treatment for A„ (e), it is seen that A„^fc(e) satisfies (4.39). As a 
by-product, by (4.37) and (4.40), 

z-i 

max |i?„,.,c(e)| < ia7("-')^^Q^;i_l 17(1 + Ci^i^-i)- (4-41) 
Combining (4.26), (4.39), and (4.41), for any = /, 



(^) - 7 ^ (^) 



< A„_i,i(e) + 2|i?„,^,e(e) 



Let $T_{n,\nu}$ be as in (4.27). With (4.39) being established now, by (4.29), we get
the following bounds similar to (4.41)



/-I 



max \TnA£)\ < ^a7^""'^'''Q^^i 17(1 + (4-42) 

Combine (4.26), (4.30) and the above inequalities. It is seen that for some 
constants Q > 1, 

|Afc^-A2i,J<Qa7("-'-^)-^n^r^)/^. 
Then applying Lemma 4.7 to $v_n = V_n(\epsilon_0)$, the lemma is proved. □

Now for n > 1, {A^^M < |Ag(e)| + EL2 l^^n^e) - Ai^2i,.(e)|. Letting k = 1 
in (4.28) and (4.29) and combining them with (4.32) and (4.35), it is seen that 
|A^'^](e)| < C|Vi(e)|l^l, where C is a constant. Together with (4.34), this implies 
there is a constant C/ = C/(7, (p<t:), such that for v with I < = I < q, 

00 

|Ai^l(e)| < QY,PiAQ + Mg{Zt,eo)f^'^'^/\ \e\ < e^, (4.43) 
t=i 

where /3i^t = Sfclt+i T*"^'^'^^^ = ^'((^7)*) for any < c < I/7. 

Part 2 of Theorem 2.4 is an immediate consequence of the next result. 
Lemma 4.12 Let $\epsilon_0 > 0$. Almost surely, the following statements hold for all
$|\epsilon| \le \epsilon_0$, $n \ge 1$ and $\nu$ with $1 \le |\nu| \le q$.

1. $E[(\ln \Lambda_{n,a})^{(\nu)}(\epsilon) \mid \eta]$ and $E[(\ln \Lambda_{n,a})(\epsilon) \mid \eta]^{(\nu)}$ both exist and are equal.

2. Asn^ 00, E[AK(e) | r?] ^ E[Li")(e) | rj]. 

3. Asn^ 00, {E[An,a{e) \ rj])^"^ ^ (E[U(e) | t]])^^). 

Proof. 1. It is not hard to see that (In A„^a)('^)(e) is a linear combination of products 
of the form 

hn,vi,...,uA^) ■■= 7 ' J'fc > 0, Vl^ h l^s = Z^. 

By (4.33) and (4.43), 

s 00 

|/in,.„...,..(e)| <C:=C^E/5M['? + ^5(^^,eo)rl"'''(l"'^■l+')/^ \e\<eo, 

k=l t=l 

with C = C(7,0*) a constant. As |z^fc|(|j^fc| + 1) < I'^KI'^I + 1); by Assumption 5 
and the independence of Zt from each other, < 00. Note that C is independent 
of $\eta$. It follows that $(\ln \Lambda_{n,a})^{(\nu)}(\epsilon)$ for all $n$ and $|\epsilon| \le \epsilon_0$ are bounded by a single
random variable that has a finite expectation and is independent of $\eta$. This implies
$E[(\ln \Lambda_{n,a})^{(\nu)}(\epsilon) \mid \eta]$ exists, and together with $\ln \Lambda_{n,a} \in C^{(q)}$, implies the rest of part 1
through dominated convergence.

2. By Theorems 2.1 and 2.2, $\Lambda^{(\nu)}_{n,a}(\epsilon)$ converges as $n \to \infty$ for all $\epsilon$. By (4.33),
it follows that $(\ln \Lambda_{n,a})^{(\nu)}(\epsilon)$ converges. Then the claim follows from dominated
convergence.

3. Consider hn,ui,...,usi^) again. By Lemma 4.11 and (4.33), it can be seen that 

for i^i,...,i^s > Owithz^iH hi^s = 1^, |/in+i,i/i,...,i/s(e)-^n,i/i,...,i/s(e)l < C7"'C holds 

for \e\ < Eq, where C > is a constant and C > is a random variable independent 
of T] with E^ < 00. As a result, E[(ln A„^a)('^^(e) | r/] converges uniformly on each 
compact set of e. Together with part 1, this implies part 3. □ 

4.6 Proof for the binary case 

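The following simple identity will be repeatedly used. For any function $F$ on $\{0, 1\}$,
denote $dF = F(1) - F(0)$. Then for $s, t \in \mathbb{Z}$,

$$E_\sigma[F(\sigma_t) \mid \sigma_s = 1] - E_\sigma[F(\sigma_t) \mid \sigma_s = 0] = D_{st}\, dF, \tag{4.44}$$
$$F(0) - E_\sigma[F(\sigma_t) \mid \sigma_s = 0] = -P_{st}(0,1)\, dF. \tag{4.45}$$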
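Both identities are elementary and can be checked numerically. The sketch below is only an illustration under stated assumptions: the chain is taken to be homogeneous, so that $P_{st}$ is a matrix power, and $D_{st}$ is taken to mean $P_{st}(1,1) - P_{st}(0,1)$, which is what (4.44) forces when $F$ is arbitrary. The function names are hypothetical.

```python
def mat_mul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transition(P, steps):
    """(t - s)-step transition matrix of a homogeneous two-state chain."""
    out = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(steps):
        out = mat_mul(out, P)
    return out

def check_identities(P, steps, F):
    """Return both sides of (4.44) and of (4.45) for F = (F(0), F(1))."""
    Pst = transition(P, steps)
    # cond_mean[a] = E[F(sigma_t) | sigma_s = a]
    cond_mean = [Pst[a][0] * F[0] + Pst[a][1] * F[1] for a in (0, 1)]
    dF = F[1] - F[0]
    D = Pst[1][1] - Pst[0][1]          # assumed meaning of D_st
    lhs_444, rhs_444 = cond_mean[1] - cond_mean[0], D * dF
    lhs_445, rhs_445 = F[0] - cond_mean[0], -Pst[0][1] * dF
    return lhs_444, rhs_444, lhs_445, rhs_445

# Illustrative values (assumptions): a 3-step gap and an arbitrary F.
lhs_444, rhs_444, lhs_445, rhs_445 = check_identities(
    [[0.8, 0.2], [0.3, 0.7]], 3, (-1.5, 2.0))
```

Both pairs agree up to rounding for any choice of transition matrix, gap and $F$, since each side is linear in $dF$.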

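Define for $t \in \mathbb{Z}$ and $n \ge 1$,

$$\ell_t(\epsilon,a) = \ln\psi_t(\epsilon,a), \qquad S_n(\epsilon) = \sum_{t=1}^n \ell_t(\epsilon,\sigma_t).$$

Then $d_t(\epsilon) = \ell_t(\epsilon,1) - \ell_t(\epsilon,0)$ and $\Lambda_n(\epsilon) = \ln E_\sigma[e^{S_n(\epsilon)} \mid \sigma_0 = 1] - \ln E_\sigma[e^{S_n(\epsilon)} \mid \sigma_0 = 0]$.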
Proof of Theorem 3.1. For $n \ge 1$, by (3.2) and (4.44),

$$\Lambda_n'(0) = E_\sigma[S_n'(0) \mid \sigma_0 = 1] - E_\sigma[S_n'(0) \mid \sigma_0 = 0]$$
$$= \sum_{t=1}^n \left\{ E_\sigma[\ell_t'(0,\sigma_t) \mid \sigma_0 = 1] - E_\sigma[\ell_t'(0,\sigma_t) \mid \sigma_0 = 0] \right\} = \sum_{t=1}^n D_{0t}\, d_t'(0).$$

By Theorems 2.2 and 2.3, letting $n \to \infty$, (3.3) follows.
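The finite-$n$ identity above can be compared with a finite-difference derivative. The sketch below is an illustration under assumptions that are not part of the proof: a homogeneous two-state chain, fixed observations $x_t$, and $\psi_t(\epsilon,a)$ proportional to the $N(\epsilon s_a, 1)$ density at $x_t$ with hypothetical slopes $s_0, s_1$, so that $\ell_t(0,1) = \ell_t(0,0)$ and $d_t'(0) = x_t(s_1 - s_0)$. As before, $D_{0t}$ is taken to be $P_{0t}(1,1) - P_{0t}(0,1)$.

```python
import math

def log_ratio(P, xs, slopes, eps):
    """ln E[e^{S_n} | sigma_0 = 1] - ln E[e^{S_n} | sigma_0 = 0]."""
    # vals[a] = E[exp(sum_{t > k} l_t) | sigma_k = a], built backwards from k = n.
    vals = [1.0, 1.0]
    for x in reversed(xs):
        # Gaussian kernel; the normalizing constant cancels in the difference.
        psi = [math.exp(-0.5 * (x - eps * s) ** 2) for s in slopes]
        vals = [sum(P[a][b] * psi[b] * vals[b] for b in (0, 1)) for a in (0, 1)]
    return math.log(vals[1]) - math.log(vals[0])

def derivative_formula(P, xs, slopes):
    """sum_t D_{0t} d'_t(0), with D_{0t} = P_{0t}(1,1) - P_{0t}(0,1)."""
    P0t = [[1.0, 0.0], [0.0, 1.0]]
    total = 0.0
    for x in xs:
        P0t = [[sum(P0t[i][k] * P[k][j] for k in (0, 1)) for j in (0, 1)]
               for i in (0, 1)]
        d_prime = x * (slopes[1] - slopes[0])   # l'_t(0,1) - l'_t(0,0)
        total += (P0t[1][1] - P0t[0][1]) * d_prime
    return total

# Illustrative inputs (assumptions).
P = [[0.8, 0.2], [0.3, 0.7]]
xs = [0.3, -1.2, 0.8, 1.5, -0.4]
slopes = (-0.5, 1.0)
h = 1e-5
finite_diff = (log_ratio(P, xs, slopes, h) - log_ratio(P, xs, slopes, -h)) / (2 * h)
closed_form = derivative_formula(P, xs, slopes)
```

The agreement rests only on $\ell_t(0,\cdot)$ being constant in the state, which is the property the proof uses at $\epsilon = 0$; the Gaussian form is a convenience.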
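To get $r''(0)$, for $n \ge 1$,

$$\Lambda_n''(0) = E_\sigma[S_n''(0) \mid \sigma_0 = 1] - E_\sigma[S_n''(0) \mid \sigma_0 = 0] + \mathrm{Var}_\sigma[S_n'(0) \mid \sigma_0 = 1] - \mathrm{Var}_\sigma[S_n'(0) \mid \sigma_0 = 0].$$

Similar to the calculation of $r'(0)$,

$$\lim_{n\to\infty} \left\{ E_\sigma[S_n''(0) \mid \sigma_0 = 1] - E_\sigma[S_n''(0) \mid \sigma_0 = 0] \right\} = \sum_{t=1}^\infty D_{0t}\, d_t''(0).$$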

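On the other hand, denoting $\delta_t = \ell_t'(0,\sigma_t)$,

$$\mathrm{Var}_\sigma[S_n'(0) \mid \sigma_0] = \sum_{t=1}^n \mathrm{Var}_\sigma(\delta_t \mid \sigma_0) + 2 \sum_{1 \le s < t \le n} \mathrm{Cov}_\sigma(\delta_s, \delta_t \mid \sigma_0).$$

For $1 \le s < t \le n$, by $E_\sigma[\delta_s\delta_t \mid \sigma_0] = E_\sigma[F(\sigma_s) \mid \sigma_0]$, with $F(\sigma_s) = \delta_s E_\sigma[\delta_t \mid \sigma_s]$,

$$E_\sigma[\delta_s\delta_t \mid \sigma_0 = 1] - E_\sigma[\delta_s\delta_t \mid \sigma_0 = 0] = D_{0s}\, dF.$$

As $\ell_s'(0,1) = \ell_s'(0,0) + d_s'(0)$ and $E[\delta_t \mid \sigma_s = 1] = E[\delta_t \mid \sigma_s = 0] + D_{st} d_t'(0)$,

$$dF = \ell_s'(0,1) E_\sigma(\delta_t \mid \sigma_s = 1) - \ell_s'(0,0) E_\sigma(\delta_t \mid \sigma_s = 0)$$
$$= E_\sigma(\delta_t \mid \sigma_s = 0)\, d_s'(0) + D_{st}\,\ell_s'(0,0)\, d_t'(0) + D_{st}\, d_s'(0)\, d_t'(0).$$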

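Likewise,

$$E_\sigma(\delta_s \mid \sigma_0 = 1) E_\sigma(\delta_t \mid \sigma_0 = 1) - E_\sigma(\delta_s \mid \sigma_0 = 0) E_\sigma(\delta_t \mid \sigma_0 = 0)$$
$$= D_{0s} E_\sigma(\delta_t \mid \sigma_0 = 0)\, d_s'(0) + D_{0t} E_\sigma(\delta_s \mid \sigma_0 = 0)\, d_t'(0) + D_{0s} D_{0t}\, d_s'(0)\, d_t'(0).$$

So we get

$$\mathrm{Cov}_\sigma(\delta_s, \delta_t \mid \sigma_0 = 1) - \mathrm{Cov}_\sigma(\delta_s, \delta_t \mid \sigma_0 = 0)$$
$$= D_{0s} \left[ E_\sigma(\delta_t \mid \sigma_s = 0) - E_\sigma(\delta_t \mid \sigma_0 = 0) \right] d_s'(0)$$
$$+ \left[ D_{0s} D_{st}\,\ell_s'(0,0) - D_{0t} E_\sigma(\delta_s \mid \sigma_0 = 0) \right] d_t'(0)$$
$$+ D_{0s} (D_{st} - D_{0t})\, d_s'(0)\, d_t'(0).$$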
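By conditioning on $\sigma_s$,

$$E_\sigma(\delta_t \mid \sigma_s = 0) - E_\sigma(\delta_t \mid \sigma_0 = 0) = E_\sigma(\delta_t \mid \sigma_s = 0) - E_\sigma[E_\sigma(\delta_t \mid \sigma_s) \mid \sigma_0 = 0]$$
$$\overset{(a)}{=} -P_{0s}(0,1) \left[ E_\sigma(\delta_t \mid \sigma_s = 1) - E_\sigma(\delta_t \mid \sigma_s = 0) \right]$$
$$\overset{(b)}{=} -D_{st} P_{0s}(0,1)\, d_t'(0),$$

where (a) is by (4.45), and (b) by (4.44). Combining the equations,

$$\mathrm{Cov}_\sigma(\delta_s, \delta_t \mid \sigma_0 = 1) - \mathrm{Cov}_\sigma(\delta_s, \delta_t \mid \sigma_0 = 0)$$
$$= D_{0s} \left[ D_{st} P_{0s}(0,0) - D_{0t} \right] d_s'(0)\, d_t'(0) + \left[ D_{0s} D_{st}\,\ell_s'(0,0) - D_{0t} E_\sigma(\delta_s \mid \sigma_0 = 0) \right] d_t'(0).$$

In particular, by letting $s = t$, we get

$$\mathrm{Var}_\sigma(\delta_t \mid \sigma_0 = 1) - \mathrm{Var}_\sigma(\delta_t \mid \sigma_0 = 0) = D_{0t} \left[ P_{0t}(1,0) - P_{0t}(0,1) \right] [d_t'(0)]^2.$$

Combining all the above formulas and letting $n \to \infty$, (3.4) follows.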
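The variance identity obtained by letting $s = t$ can be checked directly, since given $\sigma_0$ the variable $\delta_t$ takes only two values. The sketch below is an illustration with hypothetical names and arbitrary inputs; it assumes $D_{0t} = P_{0t}(1,1) - P_{0t}(0,1)$ and writes `g0`, `g1` for $\ell_t'(0,0)$, $\ell_t'(0,1)$.

```python
def var_two_point(p_one, g0, g1):
    """Variance of a variable equal to g1 with probability p_one, else g0."""
    mean = (1.0 - p_one) * g0 + p_one * g1
    return (1.0 - p_one) * (g0 - mean) ** 2 + p_one * (g1 - mean) ** 2

def variance_gap(P0t, g0, g1):
    """Both sides of Var(delta_t | s0 = 1) - Var(delta_t | s0 = 0)."""
    lhs = var_two_point(P0t[1][1], g0, g1) - var_two_point(P0t[0][1], g0, g1)
    D0t = P0t[1][1] - P0t[0][1]                  # assumed meaning of D_0t
    rhs = D0t * (P0t[1][0] - P0t[0][1]) * (g1 - g0) ** 2
    return lhs, rhs

# Illustrative t-step transition matrix and derivative values (assumptions).
lhs, rhs = variance_gap([[0.65, 0.35], [0.40, 0.60]], -0.7, 1.9)
```

Writing $p_a = P_{0t}(a,1)$, the check reduces to $p_1(1-p_1) - p_0(1-p_0) = (p_1 - p_0)(1 - p_1 - p_0)$.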

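Proof of Proposition 3.2. Let $\lambda(x,\vartheta) = \ln f(x,\vartheta)$. Given $t$, $Z$ and $\eta$, $\ell_t(\epsilon,a)$ is a
composite of functions $\lambda(x,\vartheta)$, $\varphi(Z_t,v)$, $\theta_a(\epsilon)$ and $\theta_{\eta_t}(\epsilon)$, such that

$$\ell_t(\epsilon,a) = \lambda(\varphi(Z_t, \theta_{\eta_t}(\epsilon)), \theta_a(\epsilon)),$$

so by the chain rule for differentiation,

$$\ell_t'(\epsilon,a) = \frac{\partial\lambda(x,\vartheta)}{\partial x} \frac{\partial\varphi(Z_t,v)}{\partial v}\, \theta_{\eta_t}'(\epsilon) + \frac{\partial\lambda(x,\vartheta)}{\partial\vartheta}\, \theta_a'(\epsilon),$$

where the right hand side is evaluated at $x = \varphi(Z_t,v)$, $v = \theta_{\eta_t}(\epsilon)$, and $\vartheta = \theta_a(\epsilon)$.
Since $\theta_1(0) = \theta_0(0) = 0$, the first summand on the right hand side takes the same
value for $a = 0, 1$. Therefore, (3.8) holds.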
Likewise, 



dv 



where again the right hand side is evaluated at $x = \varphi(Z_t,v)$, $v = \theta_{\eta_t}(\epsilon)$, and
$\vartheta = \theta_a(\epsilon)$. Then (3.9) follows. □
Proof of Proposition 3.3. We shall first show for any $t$,

$$E[d_t'(0) \mid \eta] = 0, \tag{4.46}$$
$$\mathrm{Var}[d_t'(0) \mid \eta] = [\theta_1'(0) - \theta_0'(0)]^2 J(0), \tag{4.47}$$
$$E[d_t''(0) \mid \eta] = [\theta_1'(0) - \theta_0'(0)]\,[2\theta_{\eta_t}'(0) - \theta_0'(0) - \theta_1'(0)]\, J(0). \tag{4.48}$$

Take expectation conditional on $\eta$ on both sides of (3.8) to get



At $\epsilon = 0$, this is equivalent to expectation with respect to $X_t = \varphi(Z_t,0)$, which
has density $f(x,0)$. By the property of the score function for parametric models, (4.46)
follows. With a similar argument, (4.47) follows as well.
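The score-function properties invoked here can be illustrated numerically. The sketch below is an assumption-laden example, not part of the proof: it takes $f(x,\vartheta)$ to be the $N(\vartheta,1)$ density, for which the score at $\vartheta = 0$ is $x$ and $J(0) = 1$, and checks by the trapezoid rule that the score has mean zero, that its second moment is $J(0)$, and that the $v$-derivative at $v = 0$ of the expected score under density $f(x,v)$ is again $J(0)$, which is the type of identity used for (4.49). All names are hypothetical.

```python
import math

def density(x, theta):
    """f(x, theta): the N(theta, 1) density, an assumed stand-in family."""
    return math.exp(-0.5 * (x - theta) ** 2) / math.sqrt(2.0 * math.pi)

def score_at_zero(x):
    """d/dtheta ln f(x, theta) at theta = 0; equals x for N(theta, 1)."""
    return x

def integrate(g, lo=-12.0, hi=12.0, n=24000):
    """Composite trapezoid rule on [lo, hi]."""
    h = (hi - lo) / n
    total = 0.5 * (g(lo) + g(hi))
    for i in range(1, n):
        total += g(lo + i * h)
    return total * h

def score_moment(power, v):
    """E_v[score_at_zero(X)^power] for X with density f(., v)."""
    return integrate(lambda x: score_at_zero(x) ** power * density(x, v))

mean_score = score_moment(1, 0.0)      # score has mean zero at the true value
fisher_info = score_moment(2, 0.0)     # J(0) for this family
step = 1e-4
slope = (score_moment(1, step) - score_moment(1, -step)) / (2 * step)
```

For other regular families the same three checks apply once `density` and `score_at_zero` are replaced consistently.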

Take expectation conditional on $\eta$ on both sides of (3.9). Again, by the property
of the score function,



-[01(0)2- 0^,(0)2] J(0). 
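To prove (4.48), it suffices to show

$$E\left[ \frac{\partial^2\lambda(X_t,0)}{\partial x\,\partial\vartheta} \frac{\partial\varphi(Z_t,0)}{\partial v} \,\middle|\, \eta \right] = E\left[ \frac{\partial^2\lambda(X_t,0)}{\partial x\,\partial\vartheta} \frac{\partial\varphi(Z_t,0)}{\partial v} \right] = J(0), \tag{4.49}$$

where $X_t = \varphi(Z_t,0)$. Define

$$g(v,Z_t) = \frac{\partial\lambda(\varphi(Z_t,v),0)}{\partial\vartheta} = \frac{1}{f(\varphi(Z_t,v),0)} \frac{\partial f(\varphi(Z_t,v),0)}{\partial\vartheta}.$$

Observe that

$$\left. \frac{\partial g(v,Z_t)}{\partial v} \right|_{v=0} = \frac{\partial^2\lambda(X_t,0)}{\partial x\,\partial\vartheta} \frac{\partial\varphi(Z_t,0)}{\partial v}.$$

Therefore

$$E\left[ \frac{\partial^2\lambda(X_t,0)}{\partial x\,\partial\vartheta} \frac{\partial\varphi(Z_t,0)}{\partial v} \right] = E\left[ \left. \frac{\partial g(v,Z_t)}{\partial v} \right|_{v=0} \right] = \left. \frac{\partial E[g(v,Z_t)]}{\partial v} \right|_{v=0} = \left. \frac{\partial}{\partial v} E\left[ \frac{1}{f(\varphi(Z_t,v),0)} \frac{\partial f(\varphi(Z_t,v),0)}{\partial\vartheta} \right] \right|_{v=0}.$$

By Assumption 1, $\varphi(Z_t,v)$ has density $f(x,v)$. Therefore, the right hand side is

$$\left. \frac{\partial}{\partial v} \int \frac{1}{f(x,0)} \frac{\partial f(x,0)}{\partial\vartheta}\, f(x,v)\, dx \right|_{v=0} = \int \frac{1}{f(x,0)} \left[ \frac{\partial f(x,0)}{\partial\vartheta} \right]^2 dx = J(0),$$

showing (4.49).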

From (3.3) and (4.46),

$$E[r'(0) \mid \eta] = \sum_{t=1}^\infty D_{0t}\, E[d_t'(0) \mid \eta] = 0,$$

showing (3.10). From (3.4),

oo 

E[r"(0) I v]=Yl {EK'(O) I V] + [Potil, 0) - Pot(0, 1)]E[<(0)2 | r,]} 



t=i 



t=i 



T] 



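Since given $\eta$, $Z_t$ are independent, $d_s'(0)$ is independent of $d_t'(0)$ for $s \ne t$. Therefore, by (4.46)
to (4.48),

$$E[r''(0) \mid \eta] = \sum_{t=1}^\infty D_{0t} \left\{ E[d_t''(0) \mid \eta] + [P_{0t}(1,0) - P_{0t}(0,1)]\, \mathrm{Var}[d_t'(0) \mid \eta] \right\}$$
$$= [\theta_1'(0) - \theta_0'(0)]\, J(0) \sum_{t=1}^\infty D_{0t} f_t,$$

where

$$f_t = 2\theta_{\eta_t}'(0) - [P_{0t}(1,1) + P_{0t}(0,1)]\,\theta_1'(0) - [P_{0t}(1,0) + P_{0t}(0,0)]\,\theta_0'(0)$$
$$= [\theta_1'(0) - \theta_0'(0)]\,[2\eta_t - P_{0t}(1,1) - P_{0t}(0,1)].$$

Therefore, (3.11) holds. Finally, taking expectation conditional on $\eta_0$ on both sides
of (3.11), we get (3.12). □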
Appendix

In this Appendix, we make a general statement on the conditional likelihood under
the FDR criterion. Let $H_1, \ldots, H_n$ be a set of hypotheses being tested and let $X$ be
the available data. Let $p_i = \Pr\{H_i \text{ is false} \mid X\}$. For any testing procedure based on
$X$, let $R$ be the total number of rejected $H_i$ and $V$ that of rejected true $H_i$. Then
the number of rejected false nulls is $R - V$.
Proposition A.1 Given $\alpha \in (0,1)$, among all testing procedures whose rejection
decisions are uniquely determined by the data $X$ and which satisfy the FDR control
criterion

$$\mathrm{FDR} = E\left[ \frac{V}{R \vee 1} \,\middle|\, X \right] \le \alpha,$$

the following Benjamini-Hochberg procedure [4] has the largest $E[R - V \mid X]$:

1. sort $q_i = 1 - p_i$ into $q_{(1)} \le q_{(2)} \le \cdots \le q_{(n)}$;

2. let $r = \max\{j : q_{(1)} + \cdots + q_{(j)} \le \alpha j\}$ and reject $H_k$ if $q_k \le q_{(r)}$.

Proof. Given a procedure with $R > 0$, let $H_{i_k}$, $k = 1, \ldots, R$ be the rejected
nulls. Then as in [G], $\mathrm{FDR} = (q_{i_1} + \cdots + q_{i_R})/R \ge (q_{(1)} + \cdots + q_{(R)})/R$, while

$$E[R - V \mid X] = R - q_{i_1} - \cdots - q_{i_R} \le R - q_{(1)} - \cdots - q_{(R)}.$$

It is then not hard to see that under the FDR control criterion, the procedure in the
Proposition attains the largest value of $E[R - V \mid X]$. □
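The two steps of the procedure in Proposition A.1 translate directly into code. The following is a minimal sketch with a hypothetical function name; it takes the posterior probabilities $p_i$ as given, and makes the added assumption that nothing is rejected when no $j$ satisfies the condition in step 2, a case the statement leaves implicit.

```python
def reject_by_posterior(p, alpha):
    """Indices k of the hypotheses rejected by the two-step rule.

    p[k] is Pr{H_k is false | X}; alpha is the FDR level in (0, 1).
    """
    q = [1.0 - pk for pk in p]            # q_k = Pr{H_k is true | X}
    ordered = sorted(q)                   # q_(1) <= ... <= q_(n)
    running, r = 0.0, 0
    for j, qj in enumerate(ordered, start=1):
        running += qj
        if running <= alpha * j:          # q_(1) + ... + q_(j) <= alpha * j
            r = j
    if r == 0:
        return []                         # assumption: no admissible j, reject nothing
    threshold = ordered[r - 1]            # q_(r)
    return [k for k, qk in enumerate(q) if qk <= threshold]

# Illustrative posterior probabilities (assumptions).
rejected = reject_by_posterior([0.99, 0.30, 0.95, 0.60], alpha=0.1)
```

Because the sorted values are nondecreasing, their running average is nondecreasing too, so the set of admissible $j$ is an initial segment and the loop could stop at the first failure; the full scan is kept to mirror the definition of $r$.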